How do I analyze content stored on Amazon S3?

Uva Software's Scanii integrates with Amazon Web Services' Simple Storage Service (S3) through a one-click integration: our SAM-packaged scanii-lambda application.

How it works

In essence, this is a set of Lambda functions packaged in a one-click deployable application that configures everything needed for your S3 objects to be submitted automatically to scanii's content analysis API. It includes one Lambda function that submits files in S3 to our API service for processing, and another that receives callback events and takes the appropriate action depending on whether findings are present.

Currently, you can choose from two actions:

1) Tag the S3 object. This is on by default and adds the following AWS tags to processed objects:

Tag name           Tag purpose
ScaniiId           the resource ID of the processed content
ScaniiFindings     list of identified findings (content engine dependent)
ScaniiContentType  the identified content type of the processed file

2) Delete S3 objects with findings. This is off by default and deletes S3 objects that have findings (such as malware or NSFW content). For a full list of available content detection engines, see https://support.scanii.com/article/20-content-detection-engines
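
The two actions above can be sketched as a small decision function. This is an illustrative sketch only, not the code shipped in scanii-lambda: the name plan_actions, the shape of its inputs, and the space-separated encoding of findings are assumptions. Only the three tag names and the on/off defaults come from the description above.

```python
from typing import Dict, List, NamedTuple


class Plan(NamedTuple):
    tags: Dict[str, str]  # tags to attach to the S3 object (empty if tagging is off)
    delete: bool          # whether the object should be deleted


def plan_actions(resource_id: str,
                 findings: List[str],
                 content_type: str,
                 tag_objects: bool = True,
                 delete_on_findings: bool = False) -> Plan:
    """Decide what to do with an object once its analysis result arrives."""
    tags: Dict[str, str] = {}
    if tag_objects:
        tags = {
            "ScaniiId": resource_id,
            # Assumption: multiple findings are joined with spaces.
            "ScaniiFindings": " ".join(findings),
            "ScaniiContentType": content_type,
        }
    # Deletion only applies when it is enabled AND at least one finding exists.
    delete = delete_on_findings and len(findings) > 0
    return Plan(tags=tags, delete=delete)
```

With the defaults, an object is tagged and never deleted; passing delete_on_findings=True marks an object for deletion only when the findings list is non-empty.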

Deploying

Step 1 - Deploying our S3 integration to your AWS account

Deploy our application to your account by clicking here

For that you will need 3 things: 

  1. An active Amazon Web Services account 
  2. Your scanii API credentials. If you don't already have them, create a scanii.com account and API key (see https://support.scanii.com/article/21-managing-api-keys)
  3. The name of the bucket you would like to monitor for events

Step 2 - Enable events from your S3 bucket to our integration 

Due to a quirk in the way SAM applications work, our application cannot automatically wire itself into object creation events for your S3 bucket, so the last step in setting up our integration is configuring those events. To do that:

1) Log into the AWS Lambda Console and click on the function called uvasoftware-scanii-lambda-submit (this is the function that submits content to scanii.com for processing).

2) Under "Add triggers", select S3 and fill in the bucket information, such as the bucket name and an optional prefix path. The important thing is to leave "Event type" set to Object Created (All), which ensures that this Lambda function is notified every time a new object is created. Lastly, click Add to add the event and Save to save your changes to the Lambda function.

That's it. From this point on, all files/objects added to that bucket will be automatically processed and tagged or deleted 🎉
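
If you would rather script the trigger from Step 2 than use the console, the same settings can be expressed as an S3 notification configuration. The sketch below only builds that configuration as a plain dictionary; the helper name build_notification_config, the example ARN, and the bucket name in the comment are hypothetical. It is not part of the scanii-lambda application.

```python
from typing import Any, Dict, Optional


def build_notification_config(function_arn: str,
                              prefix: Optional[str] = None) -> Dict[str, Any]:
    """Build an S3 notification configuration that targets a Lambda function."""
    rule: Dict[str, Any] = {
        "LambdaFunctionArn": function_arn,
        # "Object Created (All)" in the console corresponds to this event name.
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Optional prefix path, as in the console's trigger form.
        rule["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"LambdaFunctionConfigurations": [rule]}


# Hypothetical ARN, for illustration only.
config = build_notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:uvasoftware-scanii-lambda-submit",
    prefix="uploads/",
)
# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="my-bucket", NotificationConfiguration=config)
```

Two caveats apply if you go this route: the API call replaces the bucket's existing notification configuration rather than adding to it, and S3 must already be permitted to invoke the function, which the console sets up for you when you add the trigger there.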

Advanced topics

Turning on deletion on findings

To turn on the deletion action, which automatically deletes objects from S3 that have findings (such as malware), set actionDeleteObjectOnFindings to true under application settings. You can do this either during deployment or by directly editing the environment of the uvasoftware-scanii-lambda-callback Lambda function.
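
If you change the function's environment programmatically rather than in the console, keep in mind that Lambda's configuration update replaces the whole set of environment variables, so existing values need to be merged in first. The sketch below shows only that merge step. The helper with_setting is hypothetical, and it assumes the environment variable carries the same name as the application setting, which you should verify against your deployed function.

```python
from typing import Dict


def with_setting(current: Dict[str, str], name: str, value: str) -> Dict[str, str]:
    """Return a copy of the environment with one variable set, leaving the rest intact."""
    updated = dict(current)
    updated[name] = value
    return updated


# Hypothetical existing environment; real variable names may differ.
existing = {"SOME_EXISTING_SETTING": "unchanged"}
# Assumption: the environment variable is named after the application setting.
new_env = with_setting(existing, "actionDeleteObjectOnFindings", "true")
# With boto3, the merged variables could then be written back with:
#   boto3.client("lambda").update_function_configuration(
#       FunctionName="uvasoftware-scanii-lambda-callback",
#       Environment={"Variables": new_env})
```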

Upgrading the SAM application

To upgrade our SAM application, all you need to do is trigger a redeploy, and AWS will handle the rest automatically.

Why you see the "Creates custom IAM roles or resource policies" warning

You see that warning during deployment because SAM applications cannot update object tags under SAM's default permission rules, which forces us to provide a custom IAM policy. This is a shortcoming of the SAM specification that we're working with AWS to improve.
