A Web Scraper with Python and Selenium for poly.google.com
I am very interested in game development, especially for mobile platforms, so I always need free 3D models. One of my favourite sources was poly.google.com, Google’s free 3D model service. Sadly, the website will be shut down this year, which takes an important free model resource away from me. Before that happens, I decided to download as many 3D models as possible, so I started this project to download 3D models from poly.google.com automatically.
Selenium and Web Driver
Selenium is an open-source tool for automating tests that run in web browsers. A web driver lets a Python script control the browser, so the driver for your browser must be installed before a script can control it. I used Google Chrome in this project, so if you want to follow my approach you need Chrome’s web driver (ChromeDriver). Installation instructions for both Selenium and the web driver can be found at:
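As a rough sketch of what the setup looks like once both are installed, the snippet below starts Chrome through Selenium and sends its downloads to a folder of our choosing. The function names and the `downloads` folder are illustrative assumptions, not taken from the project. The two preference keys are standard Chrome settings, and the Selenium import sits inside the function so the helper above it can be tried without a browser.

```python
import os


def build_chrome_prefs(download_dir):
    """Chrome preferences that send downloads to download_dir without asking."""
    return {
        "download.default_directory": os.path.abspath(download_dir),
        "download.prompt_for_download": False,
    }


def make_driver(download_dir="downloads"):
    """Start Chrome via Selenium (assumes Selenium 4 and a matching ChromeDriver)."""
    # Imported here so this module still loads on a machine without Selenium.
    from selenium import webdriver

    os.makedirs(download_dir, exist_ok=True)
    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", build_chrome_prefs(download_dir))
    return webdriver.Chrome(options=options)
```

With this in place, `driver = make_driver()` followed by `driver.get("https://poly.google.com")` opens the site, and `driver.quit()` closes the browser when you are done.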
Deciding Actions
Before we begin to code, we have to define the download algorithm:
- Search for a text.
- Filter the results to show only 3D models.
- Scroll a few times to load enough objects onto the screen.
- Download all found objects one by one.
- Return to the first step with the next search text.
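The steps above can be sketched as a simple loop. This is only an outline under assumptions: `scraper` is a hypothetical object with `search`, `filter_models`, `scroll` and `download_all` methods, and `FakeScraper` is a stand-in that lets the loop run without a browser. Neither is the project’s actual code.

```python
def run_downloads(scraper, search_terms, scroll_count=3):
    """Apply the steps above to each search term and return the total downloaded.

    `scraper` is a hypothetical object exposing search, filter_models,
    scroll and download_all; it is not the project's actual class.
    """
    total = 0
    for term in search_terms:  # going back to the first step means the next term
        scraper.search(term)
        scraper.filter_models()
        for _ in range(scroll_count):
            scraper.scroll()
        total += scraper.download_all()
    return total


class FakeScraper:
    """Stand-in that records calls so the loop can be tried without a browser."""

    def __init__(self):
        self.calls = []

    def search(self, term):
        self.calls.append(("search", term))

    def filter_models(self):
        self.calls.append(("filter",))

    def scroll(self):
        self.calls.append(("scroll",))

    def download_all(self):
        self.calls.append(("download",))
        return 2  # pretend two models were found and downloaded
```

Keeping the loop separate from the browser code makes it easy to check the order of the steps before pointing it at the real site.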
Interacting with the Web Page
With the Selenium library, you can interact with the elements of a web page through their id, XPath, or class name. To implement the algorithm, we need to find the XPaths and ids of the elements we want to interact with (search bar, download button, etc.). You can do this by inspecting the page’s source in a web browser.
After finding the necessary ids, XPaths, and class names, we can start coding the scraper.
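One way to keep these lookups organised is a small table of named locators, sketched below. Every locator value here is a made-up placeholder, not a real poly.google.com id, XPath or class name, and the helper names are my own. `find_element` and `find_elements` are the regular Selenium driver methods.

```python
# Selenium's By.ID, By.XPATH and By.CLASS_NAME are the plain strings
# "id", "xpath" and "class name", so they are written out here directly.
# Every value below is a made-up placeholder, not a real poly.google.com locator.
LOCATORS = {
    "search_bar": ("id", "search-input"),
    "download_button": ("xpath", "//button[@aria-label='Download']"),
    "model_card": ("class name", "model-card"),
}


def find(driver, name):
    """Look up a single element by its name in LOCATORS."""
    by, value = LOCATORS[name]
    return driver.find_element(by, value)


def find_all(driver, name):
    """Look up every element matching the named locator."""
    by, value = LOCATORS[name]
    return driver.find_elements(by, value)
```

If the site changes its markup, only the table needs updating, while calls such as `find(driver, "search_bar").send_keys("tree")` stay the same.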
Code
The project contains three script files: main.py, poly_scrapper.py and file_manager.py. The main file is the starting point of the app and handles the app logic with the help of the PolyScrapper and FileManager classes.
The FileManager class is responsible for creating subfolders, moving downloaded 3D models into the correct subfolders, and naming them correctly.
Lastly, the PolyScrapper class is responsible for interacting with the web browser: clicking elements, searching for text, filtering results, and so on.
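To give an idea of the file handling, here is a minimal sketch of the move-and-rename step using only the standard library. The function name and the naming scheme (`<term>_<index>` plus the original file extension) are assumptions for illustration. The real FileManager in the repository may work differently.

```python
import shutil
from pathlib import Path


def store_model(downloaded_file, library_dir, search_term, index):
    """Move a downloaded file into library_dir/<search_term>/ with a numbered name.

    The naming scheme (<term>_<index><original suffix>) is an assumption
    made for this sketch, not the project's actual convention.
    """
    source = Path(downloaded_file)
    target_dir = Path(library_dir) / search_term
    target_dir.mkdir(parents=True, exist_ok=True)  # create the subfolder if needed
    target = target_dir / f"{search_term}_{index:04d}{source.suffix}"
    shutil.move(str(source), str(target))  # the cut-and-paste step
    return target
```

For example, `store_model("downloads/model.zip", "library", "tree", 7)` would move the file to `library/tree/tree_0007.zip`.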
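As one example of such an interaction, the scrolling step could look roughly like the sketch below. The function name and default values are assumptions, not the project’s code. `execute_script` is the standard Selenium driver method for running JavaScript in the page.

```python
import time


def scroll_to_load(driver, times=3, pause=1.0):
    """Scroll to the bottom of the page `times` times, pausing so new items can load."""
    for _ in range(times):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to append more results
```

A fixed pause is the simplest option. On a slow connection a longer pause, or waiting until the number of loaded items stops growing, may be more reliable.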
Result
I downloaded nearly 40,000 models with this project in a couple of weeks. I don’t have much experience with Python and Selenium, so there is definitely room for improvement. But the performance was good enough for me, so I stopped developing here. You can get everything from my GitHub:
https://github.com/CihadDogan/PolyDownloader
My LinkedIn page, if you want to ask a question: