Catalog transformed data

Go to the AWS Glue Console.
In the left navigation menu, click Crawlers.
On the Crawlers page, click Create crawler.
Specify nyc-yellow-tripdata-parquet-crawler as the crawler name, and then click Next.
On the Choose data sources and classifiers screen, specify the following information, and then click Next.
- Click Add a data source
- Choose Data source – S3
- Select Location of S3 data – In this account
- Include S3 path – s3://serverlessanalytics-[your-account-id]-transformed/nyc-taxi/yellow-tripdata
- For Subsequent crawler runs, select Crawl all sub-folders
- Then click Add an S3 data source.
On the Configure security settings screen, choose ServerlessAnalyticsRole from the Existing IAM role list, and then click Next.
On the Set output and scheduling screen, choose nyctaxi_db as the database.
Under Crawler schedule, leave the frequency set to On demand, and then click Next.
Review the crawler details, and then click Create crawler.
On the Crawlers page, select nyc-yellow-tripdata-parquet-crawler, and then click Run crawler.
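
For readers who prefer to script this step, the console walkthrough above can be roughly sketched with boto3. This is a minimal sketch, not part of the workshop itself: the helper names (build_crawler_request, create_and_run_crawler, main) are hypothetical, it assumes boto3 is installed and AWS credentials are configured, and it assumes the "Crawl all sub-folders" option corresponds to the CRAWL_EVERYTHING recrawl behavior. Leaving out the Schedule parameter keeps the crawler on demand.

```python
CRAWLER_NAME = "nyc-yellow-tripdata-parquet-crawler"


def build_crawler_request(account_id: str) -> dict:
    """Build the create_crawler arguments that mirror the console choices."""
    s3_path = (
        f"s3://serverlessanalytics-{account_id}-transformed"
        "/nyc-taxi/yellow-tripdata"
    )
    return {
        "Name": CRAWLER_NAME,
        "Role": "ServerlessAnalyticsRole",   # existing IAM role from the workshop
        "DatabaseName": "nyctaxi_db",        # target database
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Assumed equivalent of "Crawl all sub-folders" on subsequent runs.
        "RecrawlPolicy": {"RecrawlBehavior": "CRAWL_EVERYTHING"},
        # No "Schedule" key: the crawler runs on demand only.
    }


def create_and_run_crawler(glue_client, account_id: str) -> str:
    """Create the crawler, start it once, and return its name."""
    request = build_crawler_request(account_id)
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])
    return request["Name"]


def main() -> None:
    """Call this manually; it talks to your real AWS account."""
    import boto3  # imported lazily so the helpers above work without boto3

    account_id = boto3.client("sts").get_caller_identity()["Account"]
    name = create_and_run_crawler(boto3.client("glue"), account_id)
    print(f"Started crawler {name}")
```

Calling main() from a session with valid credentials creates the crawler and starts one run. The request builder is kept separate from the AWS calls so the settings can be inspected before anything is created.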