What is Amazon S3?
Amazon S3 (Simple Storage Service) is a cloud-based object storage service offered by Amazon Web Services (AWS). It allows users to store, retrieve, and manage large amounts of data, such as files, images, videos, and backups, in a highly scalable and reliable manner. S3 is designed to provide high durability and availability, making it ideal for storing critical data and serving as a foundation for various types of cloud-based applications and services.
Why Do We Support Amazon S3?
We see an increasing demand for scalable and secure cloud storage solutions to house big data. Amazon S3 is one of the most popular options, used by both individuals and enterprises. By building a direct connection between Nexus and Amazon S3, we enable automatic and fuss-free asset uploading. This saves users the hassle of downloading assets from the S3 bucket to local storage and manually uploading them to Nexus, and also ensures that any updates made to the S3 bucket are automatically synced with Nexus.
Is Your Data Safe?
We do not hold any of your actual image data on Nexus. Instead, we read the image metadata from your bucket and load that information onto our platform. Access is read-only, which means that your S3 bucket is effectively the master dataset: changes made to your image dataset on our platform are not written back to your bucket. Conversely, if you change the contents of your bucket and then sync on our platform, the most recent changes will be reflected in Nexus.
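The one-way relationship described above can be pictured with a minimal sketch. This is not Nexus's actual implementation: the function name `plan_sync`, the `SyncPlan` type, and the use of a version marker per key are all hypothetical, and removing assets that have disappeared from the bucket is an assumption made for illustration. The point is only that the bucket is the source of truth and the plan never describes a write to it.

```python
from typing import Dict, List, NamedTuple


class SyncPlan(NamedTuple):
    to_add: List[str]      # keys in the bucket that the platform has not seen yet
    to_update: List[str]   # keys whose version marker changed in the bucket
    to_remove: List[str]   # keys that no longer exist in the bucket (assumed behaviour)


def plan_sync(bucket: Dict[str, str], platform: Dict[str, str]) -> SyncPlan:
    """Compute platform-side changes for a one-way, read-only sync.

    Both arguments map an object key to a version marker (for example an ETag).
    The bucket is treated as the master dataset, so the plan only ever
    describes changes to the platform copy, never to the bucket.
    """
    to_add = sorted(k for k in bucket if k not in platform)
    to_update = sorted(k for k in bucket if k in platform and bucket[k] != platform[k])
    to_remove = sorted(k for k in platform if k not in bucket)
    return SyncPlan(to_add, to_update, to_remove)


# Hypothetical listings, for illustration only.
bucket_listing = {"cats/1.jpg": "a1", "cats/2.jpg": "b2", "dogs/1.jpg": "c3"}
platform_listing = {"cats/1.jpg": "a1", "cats/2.jpg": "old", "birds/9.jpg": "z9"}
plan = plan_sync(bucket_listing, platform_listing)
print(plan)
```

Because the function only reads `bucket`, edits made on the platform side can never leak back into the bucket listing.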
Furthermore, Amazon S3 provides a number of security features to help protect stored data, including:
- Access controls: S3 allows users to set permissions on individual objects and buckets, which can be used to limit access to specific users or groups.
- Data encryption: S3 supports both server-side and client-side encryption for data at rest, and HTTPS (TLS) connections for data in transit.
- Identity and Access Management (IAM) policies: IAM allows users to create and manage users, groups, and permissions, which can be used to control access to S3 resources. We will be utilizing this feature when syncing assets with Nexus.
- Amazon Virtual Private Cloud (VPC) integration: S3 can be integrated with VPC, which allows users to create a logically-isolated section of the AWS cloud where they can launch AWS resources in a virtual network that they've defined.
However, S3's security features are only as effective as the configuration set by the user. It is the user's responsibility to configure these features properly and to ensure that access controls, encryption, and other security measures are in place and working correctly. Users should also follow security best practices when using S3, such as regularly reviewing access logs, monitoring for unauthorized access attempts, and auditing the security configuration of their S3 resources.
How Does Your S3 Bucket Sync Assets Onto Nexus?
Starting on your chosen project page, select the Assets tab on the sidebar, then select the Connect Amazon S3 Bucket option at the top. Select the Begin Setup button to start the process. Log in to your AWS account beforehand so that you have all of your S3 information at hand.
1. Bucket Details
There are four items in this section:
- Connection Name is an identifier for the connection between your Nexus project and your S3 bucket, and can be whatever you want it to be named.
- AWS Bucket Name is the name of your S3 bucket in AWS. It should follow AWS's bucket naming rules: for example, only lowercase letters, numbers, hyphens, and periods are allowed.
- Folder Prefix is an optional entry that lets you choose a specific subfolder in your S3 bucket for integration. Use it to restrict Datature's access to only the folders that you want it to read. If left empty, Nexus will use the data in the root folder of the bucket. Make sure the full folder path points to a folder containing the images that you want to be read.
- AWS Bucket Region is the region in which your bucket is stored, which you can look up in your AWS account.
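Before filling in the form, it can help to sanity-check the bucket name and folder prefix locally. The sketch below covers only a subset of the S3 general-purpose bucket naming rules (length, allowed characters, no adjacent periods, not shaped like an IP address). The helper names are hypothetical, and the trailing slash added to the prefix is an assumption for illustration, not a documented Nexus requirement.

```python
import re
from typing import Optional

# 3-63 characters; lowercase letters, digits, periods and hyphens;
# must start and end with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IP_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a bucket name against a subset of the S3 naming rules."""
    if _BUCKET_NAME.fullmatch(name) is None:
        return False
    if ".." in name:                      # adjacent periods are not allowed
        return False
    if _IP_LIKE.fullmatch(name):          # names must not look like an IP address
        return False
    return True


def normalize_prefix(prefix: Optional[str]) -> str:
    """Return '' for the bucket root, otherwise 'path/to/folder/'."""
    cleaned = (prefix or "").strip().strip("/")
    return f"{cleaned}/" if cleaned else ""


print(is_valid_bucket_name("my-images-2024"))
print(is_valid_bucket_name("My_Bucket"))
print(repr(normalize_prefix("/datasets/train/")))
```

An empty or missing prefix normalizes to an empty string, which corresponds to reading from the root of the bucket.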
2. AWS Policy
In this section, Datature generates two JSON documents that you must copy into your AWS account, in this order: first the IAM Policy, then the IAM Role.
Head over to the AWS website under your account and go to IAM.
In the IAM Dashboard, select IAM Policies on the sidebar and select the Create policy button near the top right. Select the JSON tab and paste the IAM Policy JSON generated by Nexus, replacing whatever was in the JSON editor. You can skip adding tags as they are not necessary for functionality.
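The JSON generated by Nexus is the one to paste. For readers who want to understand what a read-only S3 policy generally looks like, here is a minimal sketch. It is an assumption about the general shape, not a copy of the policy Nexus produces: the function names, the `Sid` values, and the example bucket and prefix are hypothetical. `s3:ListBucket` and `s3:GetObject` are standard S3 actions, and `s3:prefix` is a standard condition key for limiting listings to a folder.

```python
import json


def read_only_policy(bucket: str, prefix: str = "") -> dict:
    """Build an illustrative read-only IAM policy for one bucket and prefix."""
    list_statement = {
        "Sid": "ListBucket",
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": [f"arn:aws:s3:::{bucket}"],
    }
    if prefix:
        # Limit listing to keys under the chosen folder prefix.
        list_statement["Condition"] = {"StringLike": {"s3:prefix": [f"{prefix}*"]}}
    get_statement = {
        "Sid": "ReadObjects",
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
    }
    return {"Version": "2012-10-17", "Statement": [list_statement, get_statement]}


def is_read_only(policy: dict) -> bool:
    """Return True if every allowed action is an S3 Get* or List* action."""
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            if not action.startswith(("s3:Get", "s3:List")):
                return False
    return True


policy = read_only_policy("my-images-2024", "datasets/train/")
print(json.dumps(policy, indent=2))
print(is_read_only(policy))
```

A check like `is_read_only` is a quick way to confirm, before pasting any policy, that it grants no write or delete permissions on your bucket.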
In the Review policy section, simply input a policy name that you can remember. The other text fields are optional and are not necessary for functionality.
Once you have created the policy, you should see it in the list of policies. Next, select Roles on the sidebar and select the Create role button near the top right.
Under Trusted Entity Type, select Custom trust policy. As before, paste the IAM Role JSON generated by Nexus into the JSON editor, replacing all of its existing contents.
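A custom trust policy states who is allowed to assume the role. The sketch below shows the common cross-account pattern, assuming a trusted AWS account and an external ID condition. Whether the JSON generated by Nexus uses an external ID is not stated here, so treat that as an assumption; the account ID `123456789012` and the external ID string are placeholders, not Datature's real values. Always paste the JSON that Nexus generates rather than this example.

```python
import json


def trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Build an illustrative cross-account trust policy for an IAM role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The account that is allowed to assume this role (placeholder).
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # Assumed extra check: the caller must present this external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


trust = trust_policy("123456789012", "example-external-id")
print(json.dumps(trust, indent=2))
```

The trust policy controls who can use the role, while the permissions policy attached in the next step controls what the role can do.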
After selecting Next, select the policy that you created previously under Add permissions.
Finally, provide a role name on the next page. No other text fields need to be filled in for functionality.
Once you have created the new role, check the Roles page to confirm that it is listed.
3. Bucket Connection
For this section, you will need your AWS Role ARN. This can be found by going to the Roles page, selecting your newly made role, and copying the text under ARN in the Summary section.
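An IAM role ARN has the form `arn:aws:iam::<account-id>:role/<role-name>`, so a copy-paste mistake (for example, copying a policy ARN or a bucket ARN instead) is easy to catch before submitting. The following is a minimal sketch; `parse_role_arn` and `RoleArn` are hypothetical helper names, and the example ARN uses a placeholder account ID and role name.

```python
import re
from typing import NamedTuple, Optional

# partition, 12-digit account ID, then the role name (optionally preceded by a path).
_ROLE_ARN = re.compile(
    r"arn:(aws|aws-cn|aws-us-gov):iam::(\d{12}):role/([A-Za-z0-9_+=,.@/-]+)"
)


class RoleArn(NamedTuple):
    partition: str
    account_id: str
    role: str  # role name, including any path prefix


def parse_role_arn(value: str) -> Optional[RoleArn]:
    """Return the parts of an IAM role ARN, or None if the text is not one."""
    match = _ROLE_ARN.fullmatch(value.strip())
    if match is None:
        return None
    return RoleArn(*match.groups())


print(parse_role_arn("arn:aws:iam::123456789012:role/nexus-s3-read"))
print(parse_role_arn("arn:aws:s3:::my-images-2024"))
```

The second call returns `None` because a bucket ARN is not a role ARN.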
You can now complete the bucket connection. If it is successful, you will see a green heart with text saying that the connection was successful. If not, you will see a broken heart.
4. Sync Assets
Now that your S3 bucket is connected to Nexus, you can choose whether to Sync Now or Sync Later. You can sync at any time after the connection has been made in Step 3. If you choose Sync Now, Nexus will begin syncing your image metadata from the bucket onto the platform. Once the sync has completed, refresh your Assets page to see your assets loaded in from the bucket!
Once your assets have been successfully uploaded to your Nexus project, you can begin creating annotations using our in-house annotation tool suite. You can consider using our Intelligent Tools such as Intellibrush and AI Edge Refinement for more precise annotations.
If you think that annotating large quantities of data is too much of a hassle, we offer Model-Assisted Labelling to streamline your MLOps pipeline by iterating upon previously trained models to assist in data annotation for model retraining.
Our Developer’s Roadmap
We plan to expand asset syncing to other cloud storage providers, such as Azure. This will allow users and enterprises with existing data stores to conveniently utilise Nexus to train models and optimise their operations pipelines.
Want to Get Started?
If you have questions, feel free to post them in our Community Slack or contact us about how asset uploading via Amazon S3 fits in with your usage.
For more detailed information about the S3 bucket connectivity, customization options, or answers to any common questions you might have, read more on our Developer Portal.