Use Case 06: Data Analysis Example with Google Colab
Use Case: Sea‑Cargo Traffic Analysis with AI (Google Colab)
Analyze Baltic Sea cargo vessel traffic using ChatGPT and Google Colab: clean the data, detect anomalies, and build clear visualizations.
Overview
This use case demonstrates how to analyze sea‑cargo traffic with ChatGPT and Google Colab, including data cleanup, anomaly detection, and visualization. The dataset covers arrivals to Baltic Sea ports over a 12‑month period.
Dataset Details
- Countries: Sweden, Finland, Estonia, Latvia, Lithuania, Poland
- Period: 2021‑01‑01 00:00 UTC → 2022‑04‑30 23:59 UTC
- Vessel types: All cargo and all tankers
- Ship size: length ≥ 65 m
- Events: 87,624 arrival records
- Fields: Port ID, Port name, LOCODE, MMSI, IMO, vessel name, destination, vessel type, arrival/departure timestamps
Step‑by‑Step Guide
Prompt example: “You are a maritime cargo shipping expert. Summarize this dataset for me. What trends or anomalies do you see?”
AI findings: Short Sea Shipping patterns; regional trade and bulk logistics corridors; stable, contract‑based flows; and data inconsistencies in the destination fields.
Issues observed: Same port expressed differently (e.g., “LVVNT”, “LV VNT”, “VENTSPILS”); multi‑port routes like “SE STO FI SKV”; and unrealistically short dwell times.
Prompt example: “Harmonize arrival and departure destination fields.”
Normalize to UN/LOCODE, remove spaces (e.g., SE GOT → SEGOT), and retain raw columns for traceability.
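The normalization rule above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the script ChatGPT generated: the function names, the `_locode` column suffix, and the one-entry lookup table are assumptions made for the example, and the format check is purely syntactic (two letters for the country followed by three characters), so it does not confirm that a code is actually assigned to a port.

```python
import re

# Hypothetical lookup from free-text port names to UN/LOCODE. In the workflow
# described here, this would be the dictionary built from the CSV.
PORT_NAME_TO_LOCODE = {"VENTSPILS": "LVVNT"}

# Syntactic shape of a UN/LOCODE: 2-letter country code + 3 characters (A-Z, 2-9).
LOCODE_RE = re.compile(r"[A-Z]{2}[A-Z2-9]{3}")


def normalize_destination(raw, name_map=PORT_NAME_TO_LOCODE):
    """Return a 5-character UN/LOCODE for a raw destination, or None if unresolved."""
    if not raw:
        return None
    text = raw.strip().upper()
    if text in name_map:                   # free-text port name, e.g. "VENTSPILS"
        return name_map[text]
    compact = re.sub(r"\s+", "", text)     # "SE GOT" -> "SEGOT"
    if LOCODE_RE.fullmatch(compact):
        return compact
    return None                            # multi-port routes etc. stay unresolved


def add_locode_columns(row):
    """Add normalized columns next to the raw ones so the originals stay traceable."""
    for col in ("vesselDestinationArrival", "vesselDestinationDeparture"):
        row[col + "_locode"] = normalize_destination(row.get(col))
    return row
```

Values that cannot be resolved, such as the multi-port route "SE STO FI SKV", come back as None rather than being guessed, which keeps them available for manual review.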
Prompts used: (1) Create a script that corrects destinations in vesselDestinationArrival and vesselDestinationDeparture to UN/LOCODE; (2) Build a Python dictionary mapping portName → portLocode by scanning the CSV.
Colab workflow: Upload CSV to /content/ais/, set variables like INPUT_CSV="/content/ais/PRJ896.csv" and OUTPUT_PY="/content/ais/port_name_to_unlocode.py", then run the generated scripts.
Result: ~124 unique port mappings created; dataset cleaned and destinations standardized.
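As a rough sketch of the second prompt, the mapping could be built with the standard csv module as shown below. The column names portName and portLocode are taken from the prompt; the comma delimiter, the UTF-8 encoding, the helper names, and the majority-vote rule for conflicting codes are assumptions for illustration only.

```python
import csv
import pprint
from collections import Counter, defaultdict


def build_port_map(csv_path, name_col="portName", locode_col="portLocode"):
    """Scan the CSV and map each port name to its most frequent LOCODE."""
    votes = defaultdict(Counter)
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            name = (row.get(name_col) or "").strip().upper()
            locode = (row.get(locode_col) or "").replace(" ", "").upper()
            if name and locode:
                votes[name][locode] += 1
    # If a name appears with several codes, keep the one seen most often.
    return {name: counts.most_common(1)[0][0] for name, counts in sorted(votes.items())}


def write_port_map(mapping, py_path):
    """Write the mapping as an importable Python module."""
    with open(py_path, "w", encoding="utf-8") as fh:
        fh.write("PORT_NAME_TO_UNLOCODE = ")
        fh.write(pprint.pformat(mapping))
        fh.write("\n")
```

In Colab, calling `write_port_map(build_port_map(INPUT_CSV), OUTPUT_PY)` with the two path variables set as described above would produce the mapping file.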
Prompt example: “Propose Python code for time‑based anomaly filtering.”
The generated script calculates port dwell time and flags anomalies via a function such as apply_time_anomaly_flags(). It also includes a usage snippet for applying and testing the logic in Colab.
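One possible shape for such a function is sketched below. Only the names apply_time_anomaly_flags, port_dwell_hours, and time_anomaly_flag come from this use case; the timestamp column names, the flag labels, and the thresholds (30 minutes and 14 days) are hypothetical placeholders, and the code ChatGPT produced may differ.

```python
from datetime import datetime, timezone


def _parse(ts):
    """Parse an ISO 8601 timestamp; return None when missing or malformed."""
    if not ts:
        return None
    try:
        dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    except ValueError:
        return None
    # Assumption: timestamps without an offset are already in UTC.
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def apply_time_anomaly_flags(rows, arrival_col="arrival_time",
                             departure_col="departure_time",
                             min_hours=0.5, max_hours=24 * 14):
    """Add port_dwell_hours and time_anomaly_flag to each row (a list of dicts)."""
    for row in rows:
        arrival = _parse(row.get(arrival_col))
        departure = _parse(row.get(departure_col))
        if arrival is None or departure is None:
            row["port_dwell_hours"] = None
            row["time_anomaly_flag"] = "missing_timestamp"
            continue
        hours = (departure - arrival).total_seconds() / 3600
        row["port_dwell_hours"] = hours
        if hours < 0:
            row["time_anomaly_flag"] = "negative_dwell"   # departure before arrival
        elif hours < min_hours:
            row["time_anomaly_flag"] = "too_short"        # unrealistically short stay
        elif hours > max_hours:
            row["time_anomaly_flag"] = "too_long"
        else:
            row["time_anomaly_flag"] = "ok"
    return rows
```

Flagging rows instead of deleting them keeps the decision about what to exclude visible in later analysis steps.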
Prompt example: “Provide a Python script to visualize port_dwell_hours and time_anomaly_flag.”
Outputs included a reusable Colab‑ready script and adaptations to plot distributions, counts, anomaly categories, and port‑level comparisons.
Cargo‑Traffic Anomalies Visualization
The workflow produces multiple charts, including: (1) Distribution of Port Dwell Time, (2) Time‑Based Anomaly Counts, (3) Port Dwell Time by Anomaly Type, (4) Dwell Time by Port Name, and (5) Port Dwell Time Distribution by Port.
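The aggregations behind charts (1) and (2) can be illustrated without any plotting library. The sketch below assumes the rows are dictionaries that already carry port_dwell_hours and time_anomaly_flag; the function names, the 6-hour bin width, and the 72-hour cutoff are hypothetical choices for the example, not values from the original script.

```python
from collections import Counter


def anomaly_counts(rows):
    """Count rows per time_anomaly_flag value (data for chart 2)."""
    return Counter(row["time_anomaly_flag"] for row in rows)


def dwell_histogram(rows, bin_hours=6, max_hours=72):
    """Bin dwell times into fixed-width bins (data for chart 1).

    The final bin collects every stay of max_hours or longer; rows with a
    missing or negative dwell time are skipped.
    """
    n_bins = max_hours // bin_hours
    counts = [0] * (n_bins + 1)
    for row in rows:
        hours = row.get("port_dwell_hours")
        if hours is None or hours < 0:
            continue
        counts[min(int(hours // bin_hours), n_bins)] += 1
    labels = [f"{i * bin_hours}-{(i + 1) * bin_hours} h" for i in range(n_bins)]
    labels.append(f">= {max_hours} h")
    return list(zip(labels, counts))
```

In Colab, the resulting labels and counts could then be passed to a bar chart, for example with matplotlib's `plt.bar`.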

