Nowadays, several environmental applications take advantage of remote sensing techniques. A considerable volume of this remote sensing data arrives in near real time. Such data are diverse and delivered at high velocity; their pre-processing requires large computing capacity, and a fast execution time is critical.
This paper proposes new distributed software for remote sensing data pre-processing and ingestion using cloud computing technology, specifically OpenStack. The developed software discarded 86% of the unneeded daily files and removed around 20% of the erroneous and inaccurate datasets. Parallel processing reduced the total execution time by 90%.
Finally, the software efficiently processed the data and integrated it into the Hadoop storage system, notably HDFS, HBase, and Hive.