As one of the most comprehensive forms of documentation for recording and displaying real-world information about construction projects, realistic 3D models capture both texture and geometric shape in a single 3D scene. However, at present, the documentation of much construction infrastructure faces significant challenges. Drawing on terrestrial laser scanning (TLS), GNSS/IMU, mature photogrammetry, a UAV platform, computer vision technologies, and AI algorithms, this study proposes a workflow for 3D modeling of complex structures from multi-source data. First, a deep learning LoFTR network was used for image matching to improve matching accuracy. Then, a NeuralRecon network was employed to generate a globally consistent 3D point cloud. GNSS information was used to reduce the search space in image matching and to produce an accurate transformation matrix between the image scene and the global reference system. In addition, RPM-Net was used to make the co-registration of the two point clouds (TLS and image-based) more effective and efficient. The proposed workflow processes the 3D laser point cloud and the UAV low-altitude multi-view image data to generate a complete, accurate, high-resolution, and detailed 3D model. The workflow was validated experimentally on a real high formwork project, and the results indicate that the generated 3D model has satisfactory accuracy, with a registration error of 5 cm. The TLS model, the image-based model, the data fusion 1 model (using the common method), and the data fusion 2 model (using the proposed method) were compared in terms of completeness, geometric accuracy, texture appearance, and appeal to professionals. The results indicate that the 3D model generated by the proposed method has accuracy similar to that of the TLS model while also providing a complete model with a photorealistic appearance, which most professionals chose as their favorite.
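To make the notion of a registration error concrete, the following minimal sketch applies a 4x4 homogeneous transformation matrix to points from one cloud and measures the residual distance to their counterparts in the other. It is an illustration only: the abstract does not state which error metric was used, so the root-mean-square distance over matched check points is an assumption, and the function names, the matrix `T`, and the sample coordinates are hypothetical rather than taken from the study.

```python
import math


def apply_transform(T, p):
    """Apply a 4x4 homogeneous transform T (nested lists, row-major) to a 3D point p."""
    x, y, z = p
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3] for r in range(3))


def registration_rmse(T, source_pts, target_pts):
    """Root-mean-square distance between transformed source points and target points.

    Assumes source_pts[i] and target_pts[i] are known correspondences
    (for example, surveyed check points visible in both clouds).
    """
    if not source_pts or len(source_pts) != len(target_pts):
        raise ValueError("point lists must be non-empty and of equal length")
    squared_sum = 0.0
    for s, t in zip(source_pts, target_pts):
        moved = apply_transform(T, s)
        squared_sum += sum((a - b) ** 2 for a, b in zip(moved, t))
    return math.sqrt(squared_sum / len(source_pts))


# Hypothetical transform: 90 degree rotation about the z axis plus a translation.
T = [
    [0.0, -1.0, 0.0, 10.0],
    [1.0, 0.0, 0.0, 20.0],
    [0.0, 0.0, 1.0, 5.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Hypothetical check points (metres): image-based cloud and TLS cloud.
image_pts = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tls_pts = [(10.0, 21.0, 5.0), (9.0, 20.0, 5.0), (10.0, 20.0, 6.05)]

rmse = registration_rmse(T, image_pts, tls_pts)
print(f"registration RMSE: {rmse:.4f} m")
```

In this toy case only the third check point deviates, by 5 cm along the z axis, so the RMSE over the three points is smaller than that single residual. A real evaluation would use many more correspondences distributed over the whole structure.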