There are many well-established approaches to modelling outdoor environments, but finding an accurate and reliable approach for indoor modelling has remained a challenge. Automatic outdoor modelling and mapping rely on satellite positioning systems such as GPS. Indoors, however, such positioning systems are of little help because of localization problems. Indoor positioning is therefore done by combining inertial sensors with area learning. Previously, 3D models of indoor environments were created either manually, which is a time-consuming process, or from range images obtained with laser range scanners, which is an expensive approach. To overcome these drawbacks, we suggest utilizing crowd sensing to obtain the environment’s spatial information. In crowd sensing, a group of participants use their mobile devices to execute assigned tasks. The ubiquity of powerful modern mobile devices has made them the basis of many crowd-sensing applications. In our approach, the crowd-sensing task involves collecting scans of rooms in public buildings using these mobile devices. We thus depend on the new wave of powerful devices and platforms, e.g. Google Tango, Microsoft HoloLens and Apple ARKit, which generate scanned data in the form of 3D point clouds that can be crowd sensed and then processed to automatically generate indoor models. These devices provide capabilities such as depth perception, area learning and motion tracking, which help to acquire the spatial information of a room and its relative position. Collecting large datasets increases energy consumption on the mobile device, but this can be reduced by applying octree compression, which reduces the amount of data in the scanned point clouds. The scanned data is sent to a server, where it is processed to generate the required indoor model. The proposed tool derives the 3D indoor model of a floor.
The tool is capable of extracting all planes from the point clouds, detecting all room surfaces (i.e. at least four walls, a ceiling and a floor), classifying wall openings and classifying high-level semantics (i.e. furniture). A basic model is generated for every room on the floor, and the ground truth of that floor is used to assemble these room models into a floor model. Finally, the accuracy of the basic model is enhanced using the grammar model fitting tool. Using point cloud scans from one of our university campuses, the final indoor modelling tool was able to derive the complete 3D floor plan in compliance with the ground truth.
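
To illustrate how the octree compression mentioned above can reduce the amount of point cloud data, the following is a minimal sketch and not the implementation used in this work. It assumes points are given as (x, y, z) tuples and that every point falling into the same octree leaf at a chosen depth may be replaced by the centroid of that leaf's points. The function name `octree_downsample` and the `depth` parameter are hypothetical. Because a uniformly subdivided octree of fixed depth has leaves that coincide with a regular grid, the sketch indexes the leaves directly instead of building the tree.

```python
from collections import defaultdict


def octree_downsample(points, depth):
    """Replace all points sharing an octree leaf cell by their centroid.

    points: list of (x, y, z) tuples (assumed input format).
    depth:  subdivision depth; the bounding cube is split into 2**depth
            cells along each axis, i.e. 8**depth leaf cells in total.
    """
    if not points:
        return []

    # Axis-aligned bounding cube of the scan (root node of the octree).
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    side = max(maxs[i] - mins[i] for i in range(3)) or 1.0

    cells_per_axis = 2 ** depth
    cell_size = side / cells_per_axis

    # Group points by the integer index of the leaf cell they fall into.
    leaves = defaultdict(list)
    for p in points:
        key = tuple(
            # Clamp so points on the upper boundary stay inside the cube.
            min(int((p[i] - mins[i]) / cell_size), cells_per_axis - 1)
            for i in range(3)
        )
        leaves[key].append(p)

    # One representative point (the centroid) per occupied leaf.
    return [
        tuple(sum(q[i] for q in cell) / len(cell) for i in range(3))
        for cell in leaves.values()
    ]


if __name__ == "__main__":
    scan = [(0.0, 0.0, 0.0), (0.1, 0.1, 0.1), (4.0, 4.0, 4.0), (3.9, 4.0, 4.0)]
    reduced = octree_downsample(scan, depth=1)
    print(len(scan), "->", len(reduced), "points")
```

A smaller `depth` yields stronger reduction at the cost of geometric detail, so in practice the depth would be chosen to balance transmission cost on the mobile device against the resolution needed for plane extraction on the server.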