#data #bizdev_utils

## Overview

This note contains instructions for rapidly building an app to explore Places of Interest (PoI) data for a given location. PoI data is a great source of truth for business development, and also useful for enriching existing data with standardized addresses, domains, phone numbers, etc.

## Inputs

- OpenAI key (optional, but makes it easier to look up initial coordinates)

## Process

- Set up a virtual environment
- Install dependencies
- Infer coordinates for the location of interest
- Build the DB
- Parse key information in the DB
- Open the app in Datasette (optional)

## Outputs

- Parsed PoI database for a given location
- Datasette app to explore and visualize the data (optional)

## Execution

Create the venv:

```sh
virtualenv venv
```

Activate the venv:

```sh
source venv/bin/activate
```

Install the requirements:

```sh
pip install -r requirements.txt
```

Set the OpenAI key for `llm` (optional, but helpful):

```sh
llm keys set openai
```

Prompt for a bounding box and build the DB (this can take a few minutes when looking up a large area):

```sh
overturemaps download --bbox=$(llm 'Give me a bounding box for Stanford, Kentucky expressed as only four numbers delineated by commas, with no spaces, longitude preceding latitude.') -f geojsonseq --type=place | geojson-to-sqlite stanford.db places - --nl --pk=id
```

Another example:
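Since the download command splices the model's reply directly into `--bbox=`, a chatty or malformed answer will break it in a confusing way. A minimal sketch for sanity-checking the reply first (the coordinates shown are placeholders, not a real bounding box for Stanford):

```sh
# Capture the model's reply instead of splicing it straight into --bbox=.
# Placeholder value shown; in practice: BBOX=$(llm 'Give me a bounding box ...')
BBOX="-84.70,37.50,-84.60,37.56"

# A usable bbox is exactly four comma-separated numbers with no spaces.
if echo "$BBOX" | grep -Eq '^-?[0-9]+(\.[0-9]+)?(,-?[0-9]+(\.[0-9]+)?){3}$'; then
  echo "bbox looks valid: $BBOX"
else
  echo "unexpected bbox format: $BBOX" >&2
fi
```

If the check passes, run the download with `--bbox=$BBOX` instead of the inline `$(llm ...)` substitution.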
```sh
overturemaps download --bbox=$(llm 'Give me a bounding box for Phoenix, Arizona, to include 20 miles outside of the city in all directions, expressed as only four numbers delineated by commas, with no spaces, longitude preceding latitude.') -f geojsonseq --type=place | geojson-to-sqlite phx.db places - --nl --pk=id
```

Open the DB and enrich the data with this command (optional, but helpful). Note that the `REPLACE` calls on `addresses` strip the JSON array brackets so that `json_extract` can read the first (and typically only) address object:

```sql
CREATE TABLE places_enriched AS
SELECT
    *,
    CAST(json_extract(geometry, '$.coordinates[0]') AS FLOAT) AS longitude,
    CAST(json_extract(geometry, '$.coordinates[1]') AS FLOAT) AS latitude,
    json_extract(names, '$.primary') AS primary_name,
    json_extract(categories, '$.main') AS category_main,
    REPLACE(REPLACE(REPLACE(REPLACE(json_extract(categories, '$.alternate'), '"', ''), '[', ''), ']', ''), ',', ' | ') AS category_alternates,
    json_extract(REPLACE(REPLACE(addresses, '[', ''), ']', ''), '$.freeform') AS street_address,
    json_extract(REPLACE(REPLACE(addresses, '[', ''), ']', ''), '$.locality') AS city,
    json_extract(REPLACE(REPLACE(addresses, '[', ''), ']', ''), '$.region') AS state,
    json_extract(REPLACE(REPLACE(addresses, '[', ''), ']', ''), '$.postcode') AS zip,
    json_extract(phones, '$[0]') AS primary_phone,
    json_extract(websites, '$[0]') AS primary_website,
    json_extract(socials, '$[0]') AS primary_socials
FROM places;
```

## Requirements

These are the packages installed in the environment where this process was run. Some may not be strictly required, but if you save the list below as `requirements.txt` and `pip install -r requirements.txt` it, you'll be sure to be up and running.
```txt
aiofiles==23.2.1
annotated-types==0.6.0
anyio==4.3.0
asgi-csrf==0.9
asgiref==3.8.1
certifi==2024.2.2
click==8.1.7
click-default-group==1.2.4
datasette==0.64.6
datasette-cluster-map==0.18.1
datasette-geojson==0.4.0
datasette-leaflet==0.2.2
distro==1.9.0
geojson==3.1.0
geojson-to-sqlite==1.1.1
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
hupper==1.12.1
idna==3.7
itsdangerous==2.2.0
janus==1.0.0
Jinja2==3.1.4
llm==0.13.1
MarkupSafe==2.1.5
mergedeep==1.3.4
numpy==1.26.4
openai==1.25.2
overturemaps==0.4.0
Pint==0.23
pluggy==1.5.0
pyarrow==15.0.2
pydantic==2.7.1
pydantic_core==2.18.2
python-dateutil==2.9.0.post0
python-multipart==0.0.9
python-ulid==2.5.0
PyYAML==6.0.1
shapely==2.0.4
six==1.16.0
sniffio==1.3.1
sqlite-fts4==1.0.3
sqlite-migrate==0.1b0
sqlite-utils==3.36
tabulate==0.9.0
tqdm==4.66.4
typing_extensions==4.11.0
uvicorn==0.29.0
```
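The `json_extract` pattern used by the enrichment query can be sanity-checked on a toy row without downloading anything. A sketch assuming only a local `sqlite3` CLI with JSON support (standard in modern builds); the table name matches the real pipeline but the row is made up:

```sh
# Throwaway DB demonstrating the extraction pattern behind places_enriched.
rm -f /tmp/poi_demo.db
sqlite3 /tmp/poi_demo.db <<'SQL'
CREATE TABLE places (names TEXT, geometry TEXT);
-- A made-up row shaped like an Overture place record.
INSERT INTO places VALUES
  ('{"primary":"Acme Diner"}',
   '{"type":"Point","coordinates":[-84.66,37.53]}');
SELECT json_extract(names, '$.primary') AS primary_name,
       CAST(json_extract(geometry, '$.coordinates[0]') AS FLOAT) AS longitude
FROM places;
SQL
```

The query prints `Acme Diner|-84.66`. For the optional final step, `datasette stanford.db` serves the real database locally; with `datasette-cluster-map` installed (it's in the requirements above), the `latitude`/`longitude` columns of `places_enriched` are plotted on a map automatically.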