How do I analyze content stored on Amazon S3?
How it works
At its core, this is a pair of Lambda functions packaged as a one-click deployable application that configures everything needed for your S3 objects to be submitted automatically to scanii's content analysis API. One function submits newly created S3 objects to our API service for processing; a second function receives callback events and takes the appropriate action depending on whether findings are present.
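The actual functions are more involved, but the callback side's decision flow can be sketched roughly like this (a hypothetical Python sketch; the function name and callback payload shape are illustrative, not our real code):

```python
def handle_callback(event: dict, delete_on_findings: bool = False) -> list:
    """Decides what to do with an S3 object given a scanii analysis result.

    `event` is assumed to carry the findings list from scanii's callback,
    e.g. {"findings": ["content.malicious.eicar-test-signature"]}.
    """
    actions = ["tag"]  # tagging runs by default
    if event.get("findings") and delete_on_findings:
        actions.append("delete")  # only when the deletion action is enabled
    return actions
```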
Currently, you can choose from a couple of different actions:
1) Tag the S3 object. This defaults to on and adds the following AWS tags to processed objects:

| Tag name | Tag purpose |
|---|---|
| ScaniiId | the resource id of the processed content |
| ScaniiFindings | list of identified findings (content engine dependent) |
| ScaniiContentType | the identified content type of the file processed |
2) Delete the S3 object with findings. This defaults to off and, when enabled, deletes S3 objects with findings (such as malware or NSFW content). For a full list of available content detection engines, see https://support.scanii.com/article/20-content-detection-engines
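To illustrate the tagging action, here is a minimal sketch of how the tags in the table above could be assembled for boto3's `put_object_tagging` call (the tag names come from the table; the helper function itself is illustrative, not our actual code):

```python
def build_tag_set(scanii_id: str, findings: list, content_type: str) -> list:
    """Builds the TagSet structure that boto3's put_object_tagging expects,
    mirroring the tags described in the table above."""
    return [
        {"Key": "ScaniiId", "Value": scanii_id},
        {"Key": "ScaniiFindings", "Value": ",".join(findings)},
        {"Key": "ScaniiContentType", "Value": content_type},
    ]

# Applying the tags (requires AWS credentials):
# import boto3
# s3 = boto3.client("s3")
# s3.put_object_tagging(
#     Bucket="my-bucket",
#     Key="uploads/report.pdf",
#     Tagging={"TagSet": build_tag_set("abc123", [], "application/pdf")},
# )
```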
Step 1 - Deploying our S3 integration to your AWS account
Deploy our application to your account by clicking here.
For that, you will need three things:
- An active Amazon Web Services account
- Your scanii API credentials. If you don't already have a key, create a scanii.com account and API key (see https://support.scanii.com/article/21-managing-api-keys)
- The name of the bucket you would like to monitor for events
Step 2 - Enable events from your S3 bucket to our integration
Due to a quirk in the way SAM applications work, our application cannot automatically wire itself into object-creation events for your S3 bucket, so the last step in setting up our integration is configuring those events yourself. To do that:
1) Log into the AWS Lambda Console and click on the function called uvasoftware-scanii-lambda-submit (this is the function that submits content to scanii.com for processing).
2) Under "Add triggers", select S3 and fill in the bucket information, such as the bucket name and an optional prefix path. The important thing is to leave "Event type" set to Object Created (All), which ensures this Lambda function is notified every time a new object is created. Lastly, click Add to add the event and Save to save your changes to the Lambda function.
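If you prefer to script this instead of using the console, the same trigger can be expressed as an S3 notification configuration for boto3's `put_bucket_notification_configuration`. This is a sketch under stated assumptions (the helper function and the example ARN are illustrative):

```python
from typing import Optional

def build_notification_config(function_arn: str, prefix: Optional[str] = None) -> dict:
    """Builds the NotificationConfiguration equivalent to the console steps:
    invoke the submit function on Object Created (All), optionally filtered
    to a key prefix."""
    config = {
        "LambdaFunctionConfigurations": [
            {
                "Id": "scanii-submit",
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],  # Object Created (All)
            }
        ]
    }
    if prefix:
        config["LambdaFunctionConfigurations"][0]["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return config

# Applying it (requires AWS credentials):
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_notification_configuration(
#     Bucket="my-bucket",
#     NotificationConfiguration=build_notification_config(
#         "arn:aws:lambda:us-east-1:123456789012:function:uvasoftware-scanii-lambda-submit",
#         prefix="uploads/",
#     ),
# )
```

Note that when wiring this up outside the console, S3 must also be granted permission to invoke the function (the console's Add trigger flow does this for you; via the API it is a separate `lambda add-permission` step).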
That's it! From this point on, all files/objects added to that bucket will be automatically processed and tagged or deleted 🎉
Turning on deletion on findings
You can turn on the deletion action, which automatically deletes objects from S3 that have findings such as malware, by setting the value of actionDeleteObjectOnFindings to true under application settings. This can be done either during deployment or by directly editing the environment variables of the uvasoftware-scanii-lambda-callback Lambda function.
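Editing the environment directly can also be scripted. The sketch below builds the updated variable set and shows (in comments) how it could be applied with boto3's `update_function_configuration`; the helper function is illustrative, and the setting name comes from the application settings described above:

```python
def with_delete_on_findings(env_vars: dict) -> dict:
    """Returns a copy of the function's environment variables with the
    deletion action switched on, leaving everything else untouched."""
    updated = dict(env_vars)
    updated["actionDeleteObjectOnFindings"] = "true"
    return updated

# Applying it (requires AWS credentials):
# import boto3
# client = boto3.client("lambda")
# cfg = client.get_function_configuration(
#     FunctionName="uvasoftware-scanii-lambda-callback")
# client.update_function_configuration(
#     FunctionName="uvasoftware-scanii-lambda-callback",
#     Environment={"Variables": with_delete_on_findings(
#         cfg["Environment"]["Variables"])},
# )
```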
Upgrading the SAM application
To upgrade our SAM application, all you need to do is trigger a redeploy and AWS will handle everything automatically.
Why you see the "Creates custom IAM roles or resource policies" warning
You see that warning during deployment because SAM applications cannot update object tags using SAM's default permission rules, which forces us to provide a custom IAM policy. This is a shortcoming of the SAM specification that we're working with AWS to improve.