File Uploads. Storage Space and Zipped Files. How do you figure out the size ratio?



Harold Mansfield
11-06-2012, 12:05 PM
I'm trying to figure out how much I can expect to save in storage by requiring zipped files for upload.
How exactly do zipped files work and what are the space savings?

For instance, say I have 7 files that are 20 MB each. That's 140 MB. If they are zipped into one file, what will the size be? Is there a formula or ratio that I can use to determine that?

MyITGuy
11-06-2012, 03:18 PM
How exactly do zipped files work and what are the space savings?
HowStuffWorks "How File Compression Works" (http://www.howstuffworks.com/file-compression.htm)



I'm trying to figure out how much I can expect to save in storage by requiring zipped files for upload.

For instance, say I have 7 files that are 20 MB each. That's 140 MB. If they are zipped into one file, what will the size be? Is there a formula or ratio that I can use to determine that?

Unfortunately, there are too many factors to give you a definitive answer (the program used, the files being compressed, and the compression algorithm and performance setting selected). As an example, compressing MP3s and Excel documents has been useless in almost every instance I've tried, as these files are already compressed to begin with, whereas a text or CSV file can yield significant results (e.g., 60 MB files reduced to less than 1 MB).
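To make the point above concrete, here is a minimal Python sketch that measures the ratio on two made-up inputs. It uses zlib's DEFLATE (the method ZIP tools normally use); the sample data is an assumption for illustration only: repetitive CSV-style text stands in for a text file, and random bytes stand in for a file that is already compressed. Real files will land somewhere in between, which is why there is no single formula.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower means more savings)."""
    compressed = zlib.compress(data, 9)  # level 9 = best compression
    return len(compressed) / len(data)

# Repetitive text, standing in for a CSV or plain-text file (illustrative data).
text_like = b"id,name,price\n" + b"1001,widget,9.99\n" * 50_000

# Random bytes, standing in for already-compressed data such as MP3 or XLSX.
already_compressed_like = os.urandom(len(text_like))

text_ratio = compression_ratio(text_like)
random_ratio = compression_ratio(already_compressed_like)

print(f"text-like data: {text_ratio:.3f} of original size")
print(f"random data:    {random_ratio:.3f} of original size")
```

The only reliable way to estimate savings is to run something like this against a sample of the files you actually expect to receive.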

vangogh
11-06-2012, 04:35 PM
Yeah, I didn't think there was an exact ratio of file savings going on. Just simple observation of the files and folders I've zipped over the years and there being little discernible pattern in how much compression goes on.

Harold, is the main problem just trying to save space somewhere? Is the space local? On a server? Maybe we could offer some ideas if we knew more about where and why you were trying to save space.

Harold Mansfield
11-07-2012, 12:43 PM
Harold, is the main problem just trying to save space somewhere? Is the space local? On a server? Maybe we could offer some ideas if we knew more about where and why you were trying to save space.

Yes, actually. I'm trying to figure out how much space I'll need to host a site where users will be uploading their files for buyers to download. It will all be digital: PDFs, documents, CAD, maybe software downloads. But it will not be video or music.

I thought about having sellers host their own files, and merely provide the download link, but that opens up too many variables and I'll have no quality control over what people are linking to.

How much space and how many users can I get on a dedicated server with 130 GB of available storage?
Should I require everything to be zipped?
Should I set a storage limit? Say, max 5 MB? WordPress, for instance, is a pretty large package with over 1000 files, and it's only 5 MB zipped.

I have no idea how big the largest file could be, but if I use WP as a standard, at 5 MB per upload can I really get over 10k user uploads on 50 GB of storage?

MyITGuy
11-07-2012, 01:49 PM
Yes, actually. I'm trying to figure out how much space I'll need to host a site where users will be uploading their files for buyers to download. It will all be digital: PDFs, documents, CAD, maybe software downloads. But it will not be video or music.

How much space and how many users can I get on a dedicated server with 130 GB of available storage?


130 GB wouldn't be much for what is essentially a file storage or electronic downloads site. I'm basing this on the fact that your sellers will only upload the files once, but they could remain on your server for years, which increases your required storage.



Should I require everything to be zipped?

I wouldn't require it of your sellers. Instead, I would build it into the upload process so that every upload is zipped automatically.
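As a rough illustration of zipping on the server side, here is a minimal Python sketch using the standard zipfile module. The function name `store_upload` and the rule "keep the zip only if it is actually smaller" are assumptions for the example, not part of any particular upload script or shopping cart.

```python
import zipfile
from pathlib import Path

def store_upload(src: Path, storage_dir: Path) -> Path:
    """Zip `src` into `storage_dir`; keep the raw file instead if zipping does not help.

    Hypothetical helper for illustration: returns the path of whatever was stored.
    """
    storage_dir.mkdir(parents=True, exist_ok=True)
    zip_path = storage_dir / (src.name + ".zip")

    # Write a single-file archive using DEFLATE compression.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(src, arcname=src.name)

    if zip_path.stat().st_size < src.stat().st_size:
        return zip_path

    # Already-compressed formats can grow slightly when zipped, so store them as-is.
    zip_path.unlink()
    raw_path = storage_dir / src.name
    raw_path.write_bytes(src.read_bytes())
    return raw_path
```

Sellers upload whatever they have, and the size check means files that don't compress aren't stored in a pointless wrapper.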



Should I set a storage limit? Say, max 5 MB? WordPress, for instance, is a pretty large package with over 1000 files, and it's only 5 MB zipped.

I have no idea how big the largest file could be, but if I use WP as a standard, at 5 MB per upload can I really get over 10k user uploads on 50 GB of storage?
Are you making money/commission off the sellers' ability to upload/sell files? If so, I would say no to the limits; otherwise I would set a cap. But 5 MB may be too small, and you may want to monitor/review unrestricted uploads for a period of time before implementing this cap.

Your example of WordPress being less than 5 MB is expected, as it consists of text files, which are easily compressed. PDFs, certain images, etc. may not yield the same savings.
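For a rough sense of the numbers in the question, the arithmetic is just storage divided by average upload size. The Python sketch below assumes decimal units (1 GB = 1000 MB) and adds an optional reserve for the operating system and site files; the reserve parameter is an assumption, not something from the thread.

```python
def max_uploads(storage_gb: int, avg_upload_mb: int, reserve_percent: int = 0) -> int:
    """Estimate how many uploads of an average size fit in the given storage.

    Assumes 1 GB = 1000 MB. `reserve_percent` is space held back for the OS,
    site files, logs, and so on (an illustrative assumption).
    """
    usable_mb = storage_gb * 1000 * (100 - reserve_percent) // 100
    return usable_mb // avg_upload_mb

print(max_uploads(50, 5))                         # 10000
print(max_uploads(130, 5))                        # 26000
print(max_uploads(130, 20, reserve_percent=20))   # 5200
```

So 50 GB at 5 MB each does come to about 10k uploads, but the count drops quickly if the average file is closer to 20 MB and doesn't compress.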

Since I'm assuming that this is a venture you're currently exploring, it might make more sense for you to keep distribution/load balancing in mind. Instead of trying to calculate the hypothetical maximum that you would need, build your application to balance your requirements and handle the space issue automatically.

Harold Mansfield
11-07-2012, 03:28 PM
130 GB wouldn't be much for what is essentially a file storage or electronic downloads site. I'm basing this on the fact that your sellers will only upload the files once, but they could remain on your server for years, which increases your required storage.
I'm just trying to get past stage one, which is to get it up and running and get transactions flowing through it. If successful, I'm positive I'll have the money to add more resources.




It might make more sense for you to keep distribution/load balancing in mind. Instead of trying to calculate the hypothetical maximum that you would need, build your application to balance your requirements and handle the space issue automatically.

So that brings me back to needing a programmer and building this thing from scratch.

nealrm
11-07-2012, 04:18 PM
Given how cheap bandwidth and storage are, are you being penny wise and pound foolish? Yes, you can save some money on bandwidth and storage by requiring zipped files, but if it makes the site harder to use, is it worth it? A few lost customers could easily eat up the savings.

MyITGuy
11-07-2012, 04:25 PM
I'm just trying to get past stage one, which is to get it up and running and get transactions flowing through it. If successful, I'm positive I'll have the money to add more resources.
Understood, I just didn't want you to start off on the wrong foot. In this case I'd say look for a provider who has the type of plans you may see yourself growing into, or who may have additional resources you can take advantage of. As an example, look for a shared hosting company that can start you off with 50-100 GB of space but has the ability to transition/migrate you to a VPS or dedicated server down the road.


So that brings me back to needing a programmer and building this thing from scratch.

Is this a part of the e-commerce thread you have? If so, that sounds like the best route, so you get exactly what you want from this solution. However, if there is a package that fits your needs, it might be possible to integrate AWS (Amazon's storage services) or something else to make this a non-issue.

For example, it looks like you may be able to attach your AWS storage space as an iSCSI volume and mount it as if it were a physical device. /home/username/public_html serves up your webpages from your shared/VPS/dedicated server, while /home/username/public_html/uploads could be your mounted iSCSI device that actually connects to AWS for storage purposes.

MyITGuy
11-07-2012, 04:31 PM
Given how cheap bandwidth and storage are, are you being penny wise and pound foolish? Yes, you can save some money on bandwidth and storage by requiring zipped files, but if it makes the site harder to use, is it worth it? A few lost customers could easily eat up the savings.

Asking the questions up front doesn't hurt. Why throw $250+/mo away on a single dedicated server with plenty of storage and bandwidth if you can accomplish the same with pooled shared hosting/VPS accounts or another solution?

Harold Mansfield
11-07-2012, 05:21 PM
Given how cheap bandwidth and storage are, are you being penny wise and pound foolish? Yes, you can save some money on bandwidth and storage by requiring zipped files, but if it makes the site harder to use, is it worth it? A few lost customers could easily eat up the savings.

Of course I want to be smart, but I'm more concerned about maxing out my storage space early. There has to be a cut off point. I certainly don't want people trying to upload the Library of Congress.



Asking the questions up front doesn't hurt. Why throw $250+/mo away on a single dedicated server with plenty of storage and bandwidth if you can accomplish the same with pooled shared hosting/VPS accounts or another solution?
Exactly. And I'm hoping to use the existing dedicated server plan that I already have. I've got 100 GB of free space, and I'm thinking that should be enough to hold a decent catalog of items. If that gets filled, then I know I have something.



For example, it looks like you may be able to attach your AWS storage space as an iSCSI volume and mount it as if it were a physical device. /home/username/public_html serves up your webpages from your shared/VPS/dedicated server, while /home/username/public_html/uploads could be your mounted iSCSI device that actually connects to AWS for storage purposes.
I almost understand that.

nealrm
11-07-2012, 06:02 PM
Just so that you have a point of reference, storage is running $0.10 per GB and data download (bandwidth) is running $0.18 per GB. Like you, I was concerned about storage and bandwidth costs when I moved to a new server. My site is big (just short of 700 GB) and uses a decent amount of bandwidth each month. But even at those levels, I am still far more concerned about site performance than about both of those combined.
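To put those rates in perspective, here is a small Python sketch of the arithmetic. The two rates are the figures quoted above; treating the storage rate as a monthly charge and the usage numbers in the example calls are assumptions for illustration.

```python
STORAGE_RATE = 0.10   # dollars per GB stored (from the post; assumed to be per month)
TRANSFER_RATE = 0.18  # dollars per GB downloaded (from the post)

def monthly_cost(stored_gb: float, downloaded_gb: float) -> float:
    """Estimate one month's bill for storage plus download bandwidth, in dollars."""
    return round(stored_gb * STORAGE_RATE + downloaded_gb * TRANSFER_RATE, 2)

# 100 GB stored and 50 GB downloaded in a month (illustrative usage figures).
print(monthly_cost(100, 50))

# Storage alone for a 700 GB site.
print(monthly_cost(700, 0))
```

At these rates a full 100 GB catalog costs about as much per month as a domain name costs per year, which is why performance tends to matter more than the storage bill.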

MyITGuy
11-07-2012, 06:05 PM
Exactly. And I'm hoping to use the existing dedicated server plan that I already have. I've got 100 GB of free space, and I'm thinking that should be enough to hold a decent catalog of items. If that gets filled, then I know I have something.

Out of curiosity, would you mind sharing the specs of the server and what your current rate is? Depending on your current specs, you may be able to get a new server with better specs for the same, if not better pricing which might make the space issue a moot point...or at least give you more space for the short term.


I almost understand that.

Think of it as a USB Drive, that you access via a shortcut on your desktop. It looks/feels the same to you and you can access the files as you normally would...but you wouldn't know that they were on your USB Drive unless you really dived into it.

With Amazon AWS and other storage providers, there is a way to access their storage infrastructure from your server without the users or applications needing any special configuration. Most setups will let you use iSCSI (a common network protocol that presents block storage devices to your computer over a network). Once it is configured, the computer treats the iSCSI-attached device as a regular hard drive. You can then create symbolic links (i.e., shortcuts) that redirect a path to this device, which means the application thinks it's accessing /home/username/public_html/uploads on your physical server, but the files are actually being pulled from your vendor's storage infrastructure.
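The symbolic link step can be sketched in a few lines of Python. This only demonstrates the redirection idea, not the iSCSI setup itself: temporary directories stand in for the real locations, so `remote_volume` (playing the mounted storage device) and `public_html` (playing the web root) are placeholders.

```python
import os
import tempfile
from pathlib import Path

# Temporary directories stand in for the real paths (both are assumptions):
# `mounted` plays the role of the mount point of the remote storage volume,
# `web_root` plays the role of /home/username/public_html.
base = Path(tempfile.mkdtemp())
mounted = base / "remote_volume"
web_root = base / "public_html"
mounted.mkdir()
web_root.mkdir()

# Create the "shortcut": public_html/uploads now points at the mounted volume.
uploads = web_root / "uploads"
os.symlink(mounted, uploads, target_is_directory=True)

# The application writes to public_html/uploads as usual...
(uploads / "manual.pdf").write_bytes(b"%PDF-1.4 example")

# ...but the bytes actually land on the mounted volume.
print((mounted / "manual.pdf").exists())   # True
print(uploads.is_symlink())                # True
```

The application never needs to know where the files really live, so the storage behind the link can be swapped or enlarged later without touching the site code.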

MyITGuy
11-07-2012, 06:14 PM
Here are some specific apps that use AWS that you may find interesting:
ElasticDrive : Customer Apps : Amazon Web Services (http://aws.amazon.com/customerapps/921?_encoding=UTF8&fromSearch=1&queryArg=searchQuery&searchPath=customerapps&searchQuery=ISCSI&x=0&y=0)
Cloud Drive : Customer Apps : Amazon Web Services (http://aws.amazon.com/customerapps/3642?_encoding=UTF8&fromSearch=1&queryArg=searchQuery&searchPath=customerapps&searchQuery=ISCSI&x=0&y=0)

HireLogoDesign
11-08-2012, 05:43 PM
Bigger is always better when you're dealing with site uptime, etc.