What is the maximum size of all attachments uploaded to Azure DevOps?

I'm using the Azure DevOps (formerly VSTS) REST API to create work items and add attachments to them. However, I can't find a maximum size limit for all attachments together, if there is one.
I'm concerned because each file is at most 1 MB on its own, but the overall size can grow really fast.
So far I'm using this piece of documentation, which states the size limit for each attachment but not for all of them combined.

AFAIK, Azure DevOps only has a limit on the size of an individual attachment. There is no limit on the total size of all attachments; legitimate use is basically unlimited.
The service is based on a blob store and a metadata store (an Azure SQL database), and both keep growing. Either way, it's a lot of data and no one is close to hitting any limits.
You can check the blog post How much data can you put on VSOnline? for some more details:
1) Blob store – this is the size of the files, attachments, etc. that are stored on the service. The files are compressed, so that affects the size. The blob store is, for all intents and purposes, unlimited (though we may from time to time impose limits to prevent abuse). Legitimate use is basically unlimited.
Besides, Microsoft limits activity by the resources consumed rather than by a straight file-size limit; check the details:
Rate limits
Hope this helps.
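For reference, a minimal sketch of the two-step flow the question describes (upload the file, then link it to a work item) could look like the Python snippet below. It uses the requests library; the organization, project, work item id, file name and personal access token are all placeholders.

    # Sketch only: upload an attachment and link it to an existing work item.
    # ORG, PROJECT, WORK_ITEM_ID, the PAT and the file name are placeholders.
    import requests

    ORG = "my-org"
    PROJECT = "my-project"
    PAT = "<personal-access-token>"
    WORK_ITEM_ID = 42

    auth = ("", PAT)  # basic auth: empty username, PAT as password
    base = f"https://dev.azure.com/{ORG}/{PROJECT}/_apis"

    # 1) Upload the raw file; the response contains the attachment URL.
    with open("report.csv", "rb") as f:
        resp = requests.post(
            f"{base}/wit/attachments?fileName=report.csv&api-version=6.0",
            data=f.read(),
            headers={"Content-Type": "application/octet-stream"},
            auth=auth,
        )
    attachment_url = resp.json()["url"]

    # 2) Attach it to the work item via a JSON Patch "AttachedFile" relation.
    patch = [{
        "op": "add",
        "path": "/relations/-",
        "value": {"rel": "AttachedFile", "url": attachment_url},
    }]
    requests.patch(
        f"{base}/wit/workitems/{WORK_ITEM_ID}?api-version=6.0",
        json=patch,
        headers={"Content-Type": "application/json-patch+json"},
        auth=auth,
    )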

Related

Used space displayed for my Google Drive is 4 times more than the total size of the stored files

As it follows from the notification displayed, I have used almost all of my available Google Drive space (96%), while the total size of my files is only 3.5 GB. An additional 1 GB was deleted and is sitting in the bin. What is the reason, and how can I fix it? I also have a lot of files shared with me from other accounts, but according to the Google Drive documentation they should not be taken into account. Additionally, I have 0.8 GB in Gmail and no files in Google Photos.
Go to this link and check which files are consuming the most storage.
Delete the files in your bin, as they still count towards your quota.
After you delete the files, it usually takes some time for the freed space to show up on your Drive; it's a propagation matter.
Make sure you don't have a lot of photos taking up quota in your account.
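If you'd rather check this programmatically, a rough sketch against the Drive v3 API (google-api-python-client) could look like the following; it assumes an existing OAuth token stored in token.json (a placeholder name).

    # Sketch only: list the files consuming the most Drive quota.
    from google.oauth2.credentials import Credentials
    from googleapiclient.discovery import build

    creds = Credentials.from_authorized_user_file("token.json")  # placeholder token file
    drive = build("drive", "v3", credentials=creds)

    resp = drive.files().list(
        orderBy="quotaBytesUsed desc",                 # biggest quota consumers first
        pageSize=20,
        fields="files(name, quotaBytesUsed, trashed)",
    ).execute()

    for f in resp.get("files", []):
        size_mb = int(f.get("quotaBytesUsed", 0)) / (1024 * 1024)
        print(f"{f['name']}: {size_mb:.1f} MB (in bin: {f.get('trashed')})")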

Google storage operations extremely slow when using Customer Managed Encryption Key

We're planning on switching from Google-managed keys to our own keys (we work with sensitive medical data) but are struggling with the performance degradation when we turn on CMEK. We move many big files around storage in our application (5-200 GB files), both with the Java Storage API and gsutil. The former stops working on files even 2 GB in size (it times out, and when the timeout is raised it silently fails to copy the file), and the latter just takes about 100x longer.
Any insights into this behaviour?
When using CMEK, you are actually adding an additional layer of encryption on top of Google-managed encryption keys, not replacing them. As for gsutil, if your moving process involves validating the objects' hashes, then gsutil will perform an additional operation per object, which might explain why moving the big files takes much longer than usual.
As a workaround, you may instead use resumable uploads. This type of upload works best with large files since it uploads the data in multiple chunks, which allows you to resume an operation even if the flow of data is interrupted.
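For example, with the Python client library (google-cloud-storage) a chunked, resumable upload could look roughly like the sketch below; the bucket, object and file names are placeholders, and the chunk size is only illustrative.

    # Sketch only: resumable, chunked upload with google-cloud-storage.
    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("my-cmek-bucket")          # placeholder bucket
    blob = bucket.blob("exports/big-file.bin")        # placeholder object name

    # Setting chunk_size makes the client use a resumable upload in fixed-size
    # chunks, so an interrupted transfer can be resumed instead of restarted.
    blob.chunk_size = 100 * 1024 * 1024  # 100 MiB (must be a multiple of 256 KiB)

    blob.upload_from_filename("/data/big-file.bin", timeout=600)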

Need advice: How to share a potentially large report to remote users?

I am asking for advice on possibly better solutions for the part of the project I'm working on. I'll first give some background and then my current thoughts.
Background
Our clients can use my company's products to generate potentially large data sets for use in their industry. When the data sets are generated, the clients will file a processing request to us.
We want to send the clients a summary email which contains some statistical charts as well as sampling points from the data sets so they can do some initial quality control work. If the data sets are of bad quality, they don't need to file any request.
One problem is that the charts and sampling points can be potentially too large to be sent in an email. The charts and the sampling points we want to include in the emails are pictures. Although we can use low-quality format such as JPEG to save space, we cannot control how many data sets would be included in the summary email, so the total size could still exceed the normal email size limit.
In terms of technologies, we are mainly developing in Python on Ubuntu 14.04.
Goals of the Solution
In general, we want to present a report-like thing to the clients so they can do some initial QA. The report may contain external links but does not need to be very interactive; in other words, a static report should be fine.
We want to reduce the steps or things that our clients must do to read the report. For example, if the report can be just an email, the user only needs to 1) log in and 2) open the email. If they use an email client, they can skip step 1) and just open and begin to read.
We also want to minimize the burden of maintaining extra user accounts for both us and our clients. For example, if the solution requires us to register a new user account, this solution is, although still acceptable, not ranked very high.
Security is important because our clients don't want their reports to be read by unauthorized third parties.
We want the process automated. We want the solution to provide programming interface so that we can automate the report sending/sharing process.
Performance is NOT a critical issue. Our user base is not large, at most in the hundreds I think. They also don't generate data that frequently, at most once a week. We don't need real-time response; even a delay of a few hours is acceptable.
My Current Thoughts of Solution
Possible solution #1: In-house web service. I can set up a server machine and develop our own web service. We put the report into our database and the clients can then query via the Internet.
Possible solution #2: Amazon Web Services. AWS is quite mature, but I'm not sure whether it could get expensive, because so far we just want to share a report with our remote clients, which doesn't seem like a big enough deal to use AWS.
Possible solution #3: Google Drive. I know Google Drive provides API to do uploading and sharing programmatically, but I think we need to register a dedicated Google account to use that.
Any better solutions??
You could possibly use AWS S3 and CloudFront. Files can easily be loaded into S3 using the AWS SDKs and API. You can then use the API to generate secure links to the files that can only be opened for a specific time and, optionally, from a specific IP.
Files on S3 can also be automatically cleaned up after a specific time if needed using lifecycle rules.
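For instance, a lifecycle rule that expires report objects 30 days after creation might look roughly like this with boto3 (the bucket name and prefix are made up):

    # Sketch only: expire objects under a prefix 30 days after creation.
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket="client-reports",          # placeholder bucket name
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-old-reports",
                    "Filter": {"Prefix": "reports/"},   # placeholder prefix
                    "Status": "Enabled",
                    "Expiration": {"Days": 30},
                }
            ]
        },
    )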
Storage and transfer prices are fairly cheap with AWS, and remember that the S3 storage cost indicated is per month, so if you only keep an object stored for a few days then you only pay for those few days.
S3: http://aws.amazon.com/s3/pricing
Cloudfront: https://aws.amazon.com/cloudfront/pricing/
Here's a list of the SDKs for AWS:
https://aws.amazon.com/tools/#sdk
Or you can use their command line tools for Windows batch or PowerShell scripting:
https://aws.amazon.com/tools/#cli
Here's some info on how the private content URLs are created:
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/PrivateContent.html
I would suggest building this service using a mix of your #1 and #2 options. You can do the processing yourselves and leverage AWS S3, which is quite cheap, for transferring the data.
Example: 100 GB costs roughly $3.
AWS S3 is also beneficial for disaster recovery: if anything happens to your local environment, your data is safe in S3.
For security you can leverage data encryption and signed URLs in AWS S3.
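As an illustration, a rough boto3 sketch of uploading a report and handing out a time-limited S3 pre-signed URL could look like the snippet below (CloudFront signed URLs, described in the links above, additionally allow IP restrictions); the bucket and key names are placeholders.

    # Sketch only: upload the report, then generate a link that expires after 7 days.
    import boto3

    s3 = boto3.client("s3")
    bucket = "client-reports"               # placeholder bucket
    key = "reports/2016-05/summary.pdf"     # placeholder object key

    s3.upload_file("/tmp/summary.pdf", bucket, key)

    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=7 * 24 * 3600,  # seconds
    )
    print(url)  # send this link to the client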

GCS: Is there a request limit for accessing objects?

I just heard that there is a limitation on Google Cloud Storage, so that you can only access it with one request per second. I searched the internet but didn't find any appropriate answer to this.
Is this right, or can I access it more than once per second? I just want to know for a web application I'm writing at the moment that can upload and download images to the storage. If there is a limitation, it would cause some delay if more requests per second are sent from different users.
You may be referring to the limitation that you can update or overwrite the same object up to once per second. There's no limit to the number of times you can update across different objects, or to the number of reads you can do to any object.
https://cloud.google.com/storage/docs/concepts-techniques#object-updates

Amazon S3 + CloudFront Queries

I am currently making a social-sharing-like app and I've encountered a problem.
First off, S3 in my experience is slow, so I need to sync the data to multiple servers around the world to make it faster for multiple users.
So my question is, do I need to create a separate bucket for each country? Amazon has a list of their server locations. So for each user, do I calculate the nearest server and then upload there? How?
Next question: in my app people can subscribe to others and check for their updates. So realistically, this would not create a speed difference. If someone in Singapore uploaded a piece of text and has a subscriber in the United States, it wouldn't be any quicker for that subscriber because they have to download a piece of text stored all the way in Singapore.
All of this is making me confused! I personally find S3 very slow, which is why I am using CloudFront.
Any help? Am I misunderstanding the process? Thanks!
Buckets are not per country, they are per region (EU, US, Asia, etc.)
Secondly, you do not have to manage the closest URL to your S3 buckets; that's what CloudFront is for. You just get a single URL for each bucket, and CloudFront will route the user's request to the closest edge location.
PS: In addition, Amazon replicates data uploaded to your bucket across all edge locations transparently.
Amazon in no way "automatically" replicates your content out to the edge locations. Instead, your content is copied to a single edge location if (and only if) it is not already there (it could be the first pull, or it could have expired) when a user tries to access it from that edge. It is a pull mechanism, not a push. See the "Download Distributions for HTTP Delivery" section of http://aws.amazon.com/cloudfront/