For some clients, the most efficient way to ingest large content libraries into DSP will be through CSV Ingest. This method uses CSV spreadsheets to populate a dashboard with videos, channels, and associated taxonomy. The spreadsheet must follow dotstudioPRO’s specifications and formatting rules, but once it does, CSV Ingest is an efficient way to ingest multiple delivery components and automatically associate title taxonomy on a regular basis.
This article outlines the onboarding requirements, general function, and user experience of CSV ingestion for dotstudioPRO clients.
Overview
Who is it best for?
CSV ingest is best for clients who will be receiving content from multiple sources/content owners and are not capable of generating their own MRSS feeds. Clients who are developing their own content platform may wish to use CSV ingest. The infrastructure requirements for CSV ingest are somewhat steep, as content validation must be managed by the client, and the client must have access to their own Amazon S3 storage bucket.
Advantages
- Volume: CSV ingest allows multiple files to be delivered simultaneously.
- Automation: Once connected, ingest can be regularly scheduled to pull all newly added content.
- Management: CSV ingest works very well with Amazon Transfer Family. Users can have their partner studios deliver to individual subfolders in an S3 bucket, but ingest into the same dashboard.
- Time: Metadata, images, timed-text, and taxonomy can all be ingested, so videos arrive in the dotstudioPRO dashboard as complete packages.
Disadvantages
- Infrastructure: Clients are responsible for setting up their S3 infrastructure and managing how content partners deliver to their S3 network.
- Maintenance: Content and document validation, S3 network and permissions, and S3 workflows are all managed by the client and may require development to streamline.
How does it work?
DSP Clients will provide access to an S3 environment containing media files (videos, captions, images) and CSV metadata manifests which conform to dotstudioPRO’s specifications. DSP will authenticate into the client’s bucket and automatically ingest each new video and channel on a regular interval. Videos, playlists, and channels will automatically be created for each new line item in the CSVs. Once CSV files are ingested, they are moved to either an archive folder or an error folder.
Ingest Source
The location where media files and CSV manifests are stored. Currently only Amazon S3 buckets are supported as an ingest source.
Multiple Ingest Sources can be configured in the user's DSP Dashboard. An example use case would be a client who works with multiple content owners, and each content owner delivers to a different location in the client's S3 bucket.
CSV Manifests
CSV Manifest files are metadata spreadsheets, saved as .csv files (UTF-8), and can easily be created using a program like Microsoft Excel or Google Sheets. These documents list the titles that will be ingested into DSP and all of the metadata for those titles. They also tell our system where it can find all of the videos, images, and timed-text files that will be ingested.
Onboarding and Setup
To onboard for CSV ingest, clients will need to provide DSP access to an Amazon S3 bucket. At minimum, the bucket must include the following. These terms are defined later in this article.
- S3 credentials, and a path to store deliverables
- “CSV Manifests” to be ingested by DSP
- Media files to be ingested by DSP
DSP will need access to the client's S3 bucket. The DSP user credentials must have command line access and the ability to create and delete files and folders, including temporary URLs. The client must provide the following information:
- Bucket Name
- Region
- Key
- Secret
- The path where CSVs and media files will be stored
CSV ingest sources can be configured in the Ingest settings menu of the DSP Dashboard:
SETTINGS > INGEST & EXPORT > CSV INGEST
First navigate to the CSV Ingest Settings element, then click the EDIT button. Once in EDIT mode, click the ADD URL button, and configure the new Ingest Source:
Name: this is a user facing label for the ingest source. Handy for quickly identifying sources for dashboards that pull from multiple locations. If the "Use ingest source as Studio Name" toggle is enabled, this value will be set as the "Studio" value for all items ingested via CSV.
Type: currently only Amazon S3 is supported.
Region: set the AWS region where the source bucket is located.
Enabled Toggle: if enabled, the ingest source will be probed during ingest.
Autopublish Toggle: if enabled, channels ingested from the source will be set as published during ingest.
Key, Secret, Path, Bucket Name: config values associated with the source Amazon S3 bucket and user access.
Once configured, click the SAVE button. Multiple ingest sources can be saved for the dashboard.
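As a rough illustration, the values above can be collected and sanity-checked before they are typed into the dashboard. The sketch below is not part of DSP: the `IngestSource` class, its attribute names, and the `problems` helper are hypothetical and simply mirror the form fields described in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IngestSource:
    """Hypothetical local record of one Ingest Source form (not a DSP API)."""
    name: str
    bucket_name: str
    region: str
    key: str
    secret: str
    path: str
    type: str = "Amazon S3"           # the only type the article lists as supported
    enabled: bool = True              # "Enabled" toggle
    autopublish: bool = False         # "Autopublish" toggle
    use_name_as_studio: bool = False  # "Use ingest source as Studio Name" toggle

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the form looks complete."""
        issues = []
        for field_name in ("name", "bucket_name", "region", "key", "secret", "path"):
            if not getattr(self, field_name).strip():
                issues.append(f"{field_name} is empty")
        if self.type != "Amazon S3":
            issues.append("only Amazon S3 is supported")
        return issues
```

A client with several content owners might keep one such record per delivery path and check each with `problems()` before saving it in the dashboard.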
Media Asset and CSV Manifest Upload
Files (videos, timed text, and images) are uploaded to the S3 path. Use of a cloud server browser is strongly recommended.
Important: Clients should ALWAYS fully upload all media files before adding the corresponding CSV, otherwise it could trigger partial ingest. When a document is ingested, but the media files have not been uploaded, DSP treats the document as though it contains errors.
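One way to guard against partial ingest is to check a manifest against the list of files already uploaded before the CSV itself is delivered. The following is a minimal client-side sketch, not a DSP tool. The column names in `MEDIA_COLUMNS` are hypothetical placeholders (use the names from the sample template), and `uploaded_keys` is assumed to be a set of file references gathered by whatever S3 client the team already uses.

```python
import csv
import io

# Hypothetical column names; replace with the ones from the sample CSV template.
MEDIA_COLUMNS = ("video_url", "image_url", "caption_url")


def missing_media(manifest_text, uploaded_keys):
    """Return (row_number, column, value) for each referenced file not yet uploaded."""
    missing = []
    reader = csv.DictReader(io.StringIO(manifest_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column in MEDIA_COLUMNS:
            value = (row.get(column) or "").strip()
            # Empty cells are ignored here; only referenced files are checked.
            if value and value not in uploaded_keys:
                missing.append((row_number, column, value))
    return missing
```

If the returned list is empty, every file the manifest points to is already in place and the CSV can be uploaded.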
The client is responsible for creating and managing all CSV manifest documents and uploading them to their Amazon S3 bucket. Clients will need to provide a new CSV Manifest for each ingest.
Sample CSV templates can be downloaded from the dashboard in the CSV Ingest Settings: SETTINGS > INGEST & EXPORT > CSV INGEST > SAMPLE CSV TEMPLATE
In the sample template menu, users can download a CSV with sample data and field descriptions.
Ingestion
Ingest will automatically probe for all "Enabled" Ingest Sources every 24 hours. Users can also trigger an ingest manually.
Manual Ingest
- In the DSP Dashboard, navigate to the CSV Ingest Settings.
- Click the CSV Ingest Logs button.
- A new page will open, listing the CSV Ingest history. Click the button labelled "Update List."
- The page will refresh, listing any documents it located. Documents can be triggered individually, or they will be ingested automatically every 2 minutes.
During ingest, dotstudioPRO looks at the client’s Ingest Sources, then notes all new CSV Manifests found within those sources. New video records are created in the dashboard, then all files are ingested.
Dashboard Automations
When dotstudioPRO reads from a CSV Manifest, it uses the "taxonomy" field to determine what objects to create in the user's dashboard:
| Taxonomy | Description |
| Video | The data in this row will create a video. If it contains no "series name", "episode number", or "season number", a single channel will also be created and associated with the video. The video and channel will use the same metadata and images. |
| Season | The data in this row will create a child channel. Episodes are associated with the season using the "Parent ID" field on each video. |
| Series | The data in this row will create a parent channel. Seasons are associated with the series using the "Parent ID" field on each season. |
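The table above can be read as a small decision rule. The sketch below is an interpretation, not DSP's actual implementation. It assumes each row is a dictionary keyed by the field names quoted in this article, and it treats a video as standalone only when all three episodic fields are empty.

```python
EPISODIC_FIELDS = ("series name", "episode number", "season number")


def objects_for_row(row):
    """Return the dashboard objects one manifest row would create (illustrative only)."""
    taxonomy = row.get("taxonomy", "").strip().lower()
    if taxonomy == "series":
        return ["parent channel"]
    if taxonomy == "season":
        return ["child channel"]
    if taxonomy == "video":
        # Assumption: any non-empty episodic field marks the video as an episode.
        episodic = any(row.get(name, "").strip() for name in EPISODIC_FIELDS)
        return ["video"] if episodic else ["video", "single channel"]
    raise ValueError(f"unrecognized taxonomy value: {taxonomy!r}")
```

For example, a row with taxonomy "Video" and no episodic fields yields both a video and a single channel, which matches the movie case described above.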
Channel Settings
New channels are created with the following settings:
- No assigned categories
- If no wallpaper image has been supplied, one is automatically generated:
  - For Movies (single channel): the video thumbnail is used
  - For TV Seasons (child channel): the video thumbnail from the first episode is used
  - For TV Series (parent channel): the video thumbnail from the first episode of the series is used
By default, channels will be created in an unpublished state. Auto-publishing can be enabled in the Ingest and Export Settings menu of the user's dashboard.
Playlist Generation:
When a CSV Manifest contains more than one video sharing a “Parent ID” value, DSP will automatically create a playlist using these videos. The videos will be programmed into the playlist in ascending order by episode number. The playlist's name will use the format “[Shared Series Name] - [Shared Season Number]”.
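The grouping and ordering rule can be sketched as follows. This is an illustration under stated assumptions rather than DSP's code: the dictionary keys (`parent_id`, `episode_number`, `series_name`, `season_number`, `title`) are hypothetical, and episode numbers are assumed to be numeric so they can be sorted in ascending order.

```python
from collections import defaultdict


def build_playlists(videos):
    """Group videos by shared parent ID and return {playlist_name: [titles in order]}."""
    groups = defaultdict(list)
    for video in videos:
        parent_id = video.get("parent_id")
        if parent_id:
            groups[parent_id].append(video)

    playlists = {}
    for members in groups.values():
        if len(members) < 2:
            continue  # a playlist needs more than one video sharing the parent ID
        members.sort(key=lambda v: int(v["episode_number"]))
        first = members[0]
        name = f'{first["series_name"]} - {first["season_number"]}'
        playlists[name] = [v["title"] for v in members]
    return playlists
```

Sorting on `int(...)` rather than on the raw string keeps episode 10 after episode 9, which a plain text sort would not.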
Missing Series Data:
If episodic videos are provided, but no "Season" or "Series" row is listed in the CSV Manifest, channels will be created using the series title and the images from the first episode of the series.
New Episodes of an Existing Series:
When new episodes or seasons of an existing series are ingested, they are not automatically connected to playlists or channels. These assets will need to be manually associated in the dashboard. This is a safeguard to prevent trojan horses that may cause an asset to be published prematurely.
- New episodes of an existing season will not be added to any playlist.
- New seasons (child channel) of an existing series will not be associated with the Series (parent channel).
CSV Archiving
After ingestion, CSV Manifests are renamed and moved to one of two subfolders. These folders will be automatically created by the ingester in the client’s S3 during ingest. These folders are stored in the path specified for each Ingest Source. Files are renamed with a timestamp to denote when they were processed.
- Client’s File Name: MyMetadata.csv
- Renamed: MyMetadata-IngestedAt220616092617000.csv
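The example above suggests a suffix of the form `-IngestedAt` followed by a two-digit year, month, day, hour, minute, second, and three digits of milliseconds. That reading is inferred from the single example and is an assumption, as is the `archived_name` helper below. It can help when matching archived files back to their originals.

```python
from datetime import datetime
from pathlib import PurePosixPath


def archived_name(original, processed_at):
    """Build the renamed file name, assuming a YYMMDDHHMMSS + milliseconds timestamp."""
    path = PurePosixPath(original)
    milliseconds = processed_at.microsecond // 1000
    stamp = processed_at.strftime("%y%m%d%H%M%S") + f"{milliseconds:03d}"
    return f"{path.stem}-IngestedAt{stamp}{path.suffix}"
```

Under this assumption, a file named MyMetadata.csv processed on 16 June 2022 at 09:26:17 would be renamed exactly as shown in the example above.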
_IngestArchive
Files in this subfolder were ingested successfully without errors. All titles were either already present on the dashboard or new records were created. Videos that were ingested to DSP, but failed the encode process are considered successful by the ingest system.
_IngestErrors
Files in this subfolder contained errors and records were unable to be generated for at least one title. If the document contains 100 titles, and 99 of them ingest successfully, but one fails, the whole document is moved to this subfolder.
Updating Titles & Redeliveries:
CSV ingest can only be used to create new records in DSP, NOT update old ones. When DSP reads from a CSV Manifest, it will only ingest new titles into the dashboard. Because of this, titles can be completely modified/edited in the dashboard without any fear of the data reverting to its original state.
If a title needs to be updated, it is usually faster and easier to make changes directly in the dotstudioPRO dashboard.
Occasionally, users may wish to run full/partial reingestion of one or more titles using CSV Ingest. To do so they will need to make changes in both the dashboard and S3.
| Step | Description | Location |
| 1 | Delete the video from the dashboard. This will remove its ID from the system so it is considered a new video on the next ingest cycle. | Dashboard |
| 2 | The user will then need to upload a CSV with their changes to the Manifest Subfolder. They can do so by creating a new document, or by editing and moving a document from either the Archive or Error folder back into the Manifest Subfolder. | Amazon S3 |
CSV Ingest Logs
DSP users can review and manage all recent ingests through the CSV Ingest Logs. This view can be accessed directly by URL ([your dashboard]/admin/csv-ingest-audit) or through the Ingest & Export Settings.
This view displays a history of CSV Ingest logs for the user’s dashboard. Logs that contain 0 new records and 0 errors are purged after 90 days. All logs have a limited lifespan and are purged after 6 months.
- Update List Button: When clicked, this button triggers the “store-csv-record” lambda, then refreshes the page. While the lambda is running, the loading animation plays to prevent users from clicking around. On refresh, the list is updated with incomplete CSV Manifests.
- CSV Audit Table: This table lists all available CSV Ingest logs for the active dashboard. The table lists the location of each CSV Manifest, the status of each log, the number of total videos, the number of videos that were ingested, and the number of videos that contained errors. If a document is pending ingestion, its status value will be replaced with an “Ingest” button.
- Ingest Button: When clicked, this button triggers the “ingest-csv-record” lambda for the selected document, then refreshes the page. The loading animation will play while the document is being read.
- View Details Button: clicking this eye shaped icon will show the user a diagnostic report of the ingest, including a list of all successful and all errored row items.
CSV Ingest Audit Status Legend:
| Status | Meaning |
| Success | DSP was able to successfully read the entire document and no errors were detected. All new videos were ingested, and any videos extant to the client’s dashboard were skipped over. After ingest, the document was moved to the “_IngestArchive” folder. |
| Success with Errors | DSP was able to successfully read the entire document, but some errors were detected. Row items with errors were not ingested. After ingest, the document was moved to the “_IngestErrors” folder. Details about the errors can be found by clicking the “View Details” button. |
| Ingest Button | dotstudioPRO has not attempted to ingest this document. Clicking the button will trigger the document to be ingested. |
| In Progress | dotstudioPRO is currently ingesting from this document. |
| Timeout Failure | The document contains too much data and dotstudioPRO was unable to read it entirely. Some items were not ingested. The document has not been moved and will be reread on subsequent ingests. |
| Failure | dotstudioPRO was unable to locate or read from the document, or every item in the document contained errors. No items were ingested. After ingest, the document was moved to the “_IngestErrors” folder. |
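The completed statuses in the legend can be summarized as a function of how many rows errored. The sketch below is a reading of the legend, not DSP's implementation. The `classify` function and its parameters are hypothetical, and the two pre-ingest states (Ingest Button, In Progress) are left out.

```python
def classify(total_rows, error_rows, readable=True, timed_out=False):
    """Return (status, destination_folder); destination is None if the file is not moved."""
    if timed_out:
        # Per the legend, the document stays in place and is reread on later ingests.
        return ("Timeout Failure", None)
    if not readable or (total_rows > 0 and error_rows == total_rows):
        return ("Failure", "_IngestErrors")
    if error_rows > 0:
        return ("Success with Errors", "_IngestErrors")
    return ("Success", "_IngestArchive")
```

Note that a single errored row is enough to send the whole document to the errors folder, even when every other row was ingested.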
The View Details Page:
By clicking the “VIEW DETAILS” (eye icon) button beside any completed log, users can review a list of assets that were ingested to dotstudioPRO. The report lists the name of the document and the date it was accessed, and breaks down the ingest data by taxonomy level (video, parent channel, child channel, single channel). Logs can be emailed to the active user by clicking the EMAIL button.
Troubleshooting and Limitations
CSV ingest is designed as a starting point for curation and bulk delivery into a DSP dashboard. It cannot be used to update or modify existing metadata or media. If a title or group of titles partially ingests, corrections must be made within the dashboard directly; alternatively, the title can be deleted from the dashboard and re-ingested from an updated CSV. A re-ingested video will incur additional encoding costs.
Encoding Errors:
Occasionally, a video will ingest properly but fail to encode once it has arrived in the dashboard. This could be a problem with the source video or just an internal issue with the encoder. When a CSV Ingest video fails to encode, it needs to be retriggered by following the steps above.
Metadata and Ingest Errors:
Metadata and media assets are pulled directly from the spreadsheets, so if any metadata or media are missing, that information was likely missing or formatted incorrectly in the source CSV file. DSP does not check for spelling, formatting, or other syntax errors. If metadata is inaccurate, DSP will inherit the bad data. It is important for users to fully understand the specifications for CSV Manifests.
When a video or channel does not generate at all in DSP, it is likely because its row contained an error in the CSV Manifest. If a row in the CSV Manifest is missing a mandatory field, that row is skipped on ingest, and the document is then considered to contain an error.
Errored CSVs are noted in the CSV-Ingest-Audit tool. On ingest, they are moved to the client’s _IngestErrors folder.
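Because DSP does not validate manifests, clients may want to run their own check for missing mandatory fields before delivery. The following is a minimal sketch of such a check. The `MANDATORY_FIELDS` tuple is a hypothetical placeholder ("taxonomy" is named in this article, "title" is assumed), so consult the sample template for the real list.

```python
import csv
import io

# Hypothetical list; take the actual mandatory fields from the sample CSV template.
MANDATORY_FIELDS = ("taxonomy", "title")


def split_rows(manifest_text):
    """Return (ingestable_row_numbers, {row_number: missing_fields}, expected_folder)."""
    ingestable, skipped = [], {}
    reader = csv.DictReader(io.StringIO(manifest_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        missing = [f for f in MANDATORY_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            skipped[row_number] = missing
        else:
            ingestable.append(row_number)
    # Any skipped row marks the whole document as errored.
    destination = "_IngestErrors" if skipped else "_IngestArchive"
    return ingestable, skipped, destination
```

Running this before upload shows which rows would be skipped and whether the document would be expected to land in the errors folder.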
Video Failure Conditions:
From time to time a user may deliver a non-functional video (failed video). This can have several causes:
- A URL was not provided in Manifest CSV (title will not be ingested)
- An invalid URL was provided in Manifest CSV (protected video, or wrong location)
- The video fails to encode properly
If a video file fails encoding, the dashboard will still ingest the video metadata, ancillary deliverables, and established taxonomy, with the following exceptions:
- The failed video will have its status set as though it had failed to encode.
- The row for that video in the CMS will change to clearly identify it as failed, so that clients can troubleshoot.
Outages and Downtime:
CSV ingest functionality is tied to the DSP dashboard. If for any reason the dashboard goes down, CSV ingest will be inaccessible until the dashboard is restored.