S3 Special Characters Error
Hi there. I need to do a bulk upload, but several of my files have special characters in their names (Spanish file names).
I have tried s3cmd, but it fails with a Unicode error. Is there an option to work in UTF-8, or something like that, so these special characters can be included?
I can't rename the files. The folder is about 150 GB…
2 Replies
Thanks for letting us know. The special characters issue is something that our developers are aware of, and they're currently working on a fix. While I don't have a timeline for you, I can tell you it's on our radar.
In the meantime, I tested uploading some files with special characters (including Spanish alphanumeric ones) via the FTP client Cyberduck, and they uploaded without problems on my end. It might be worth trying instead of s3cmd: How to Use Linode Object Storage: Cyberduck
What Special Characters Can I Use?
I dug into this today to figure out what exactly a usable "special character" means with regard to Linode's Object Storage.
The answer is… It depends.
As a general rule of thumb, an S3 object key (the file name, including any path prefix) can contain any Unicode characters, as long as its UTF-8 encoding is no more than 1,024 bytes long. That said, you can't go wrong playing it safe! If you stick to the generic characters below, you shouldn't run into any issues.
[0-9], [a-z], [A-Z], ["_","-","."]
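If you want to check a batch of names against that conservative set before uploading, here is a minimal Python sketch. It only encodes the rule of thumb above (the safe character set plus the 1,024-byte limit on the UTF-8 encoded name); the function names `is_safe_key` and `unsafe_characters` are made up for this example and are not part of any Linode or S3 tool.

```python
import re

# Conservative character set from the post: digits, ASCII letters, "_", "-", "."
SAFE_NAME = re.compile(r"[0-9A-Za-z_.-]+")
MAX_KEY_BYTES = 1024  # limit on the UTF-8 encoded length of an object key


def is_safe_key(name: str) -> bool:
    """Return True if name uses only the conservative set and fits the size limit."""
    within_limit = len(name.encode("utf-8")) <= MAX_KEY_BYTES
    return within_limit and SAFE_NAME.fullmatch(name) is not None


def unsafe_characters(name: str) -> list[str]:
    """List the distinct characters in name that fall outside the conservative set."""
    return sorted({ch for ch in name if not SAFE_NAME.fullmatch(ch)})
```

For example, `is_safe_key("informe_2020.pdf")` passes, while a name such as `"año nuevo.txt"` is flagged because of the `ñ` and the space. A flagged name is not necessarily unusable; it just falls outside the play-it-safe set.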
Things start to break down once you start digging into the differences between Object Storage tools.
I decided to test 4 different tools (Cloud Manager, Cyberduck, Linode-CLI, S3CMD) to learn what works and what does not. For my testing, I created a test file for each character in a group using this Python script:
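# Creating a group of files for testing
import os

symbols = ['~', ':', "'", '+', '[', '\\', '@', '^', '{', '%', '(', '-', '"', '*', '|', ',', '&', '<', '`', '}', '.', '_', '=', ']', '!', '>', ';', '?', '#', '$', ')', '/']
for index, symbol in enumerate(symbols):
    fn = str(index) + symbol + 'test'
    print(fn)
    # '/' is a path separator, so '31/test' needs its parent directory created first
    os.makedirs(os.path.dirname(fn) or '.', exist_ok=True)
    # Create an empty file; the with block closes it on exit
    with open(fn, 'w'):
        pass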
Linode-CLI
# Adding Files to Object Storage using Linode CLI
# Prints the exit status of each upload (0 means success)
cd /directory/full/of/test_files
for f in *
do
    linode-cli obj put "$f" "$TEST_BUCKET1" 2> /dev/null ; echo $?
done
S3CMD
# Adding Files to Object Storage using S3CMD
# Prints the exit status of each upload (0 means success)
for f in *
do
    s3cmd put "$f" "s3://$TEST_BUCKET2/" 2> /dev/null ; echo $?
done
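The loops above print only an exit status per file, so you have to match the numbers back to file names by position. As a rough alternative, here is a Python sketch that runs one command per file and keeps the statuses keyed by file name. The helper names `run_per_file` and `failures` are invented for this example, and the sketch assumes the upload tool signals failure through a non-zero exit status, as the loops above do.

```python
import subprocess


def run_per_file(build_argv, paths):
    """Run build_argv(path) for each path and return {path: exit status}."""
    results = {}
    for path in paths:
        completed = subprocess.run(
            build_argv(path),
            stdout=subprocess.DEVNULL,  # discard output, like 2> /dev/null
            stderr=subprocess.DEVNULL,
            check=False,  # record failures instead of raising
        )
        results[path] = completed.returncode
    return results


def failures(results):
    """Return the paths whose command exited with a non-zero status."""
    return [path for path, status in results.items() if status != 0]
```

A call could look like `run_per_file(lambda f: ["s3cmd", "put", f, "s3://my-bucket/"], names)`, where `my-bucket` is a placeholder bucket name and `names` is your list of test files. Passing the arguments as a list avoids the shell, so characters like `*` or `$` in a file name are not reinterpreted.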
Cloud Manager
I dragged and dropped the test files into $TEST_BUCKET3 in the Cloud Manager window.
Cyberduck
I dragged and dropped the test files into $TEST_BUCKET4 in the Cyberduck window.
Results
As the lists below show, the third-party tools (S3CMD and Cyberduck) accepted every character tested, while the Linode CLI and Cloud Manager accepted only a select few. Keep in mind that any other third-party tools you use may have limitations of their own. Overall, the best practice is to stick to the basic choices when naming files and save yourself a headache: [0-9], [a-z], [A-Z], ["_","-","."].
Cloud_Manager_acceptable_characters = ["~", "-", ".", "_", "%"]
Linode_CLI_acceptable_characters = ["~", "-", ".", "_", "?", "#", "%"]
S3CMD_acceptable_characters = ['~', ':', "'", '+', '[', '\\', '@', '^',
                               '{', '%', '(', '-', '"', '*', '|', ',',
                               '&', '<', '`', '}', '.', '_', '=', ']',
                               '!', '>', ';', '?', '#', '$', ')', '/']
Cyberduck_acceptable_characters = ['~', ':', "'", '+', '[', '\\', '@', '^',
                                   '{', '%', '(', '-', '"', '*', '|', ',',
                                   '&', '<', '`', '}', '.', '_', '=', ']',
                                   '!', '>', ';', '?', '#', '$', ')', '/']
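To turn those lists around and see what each first-party tool rejected, a small Python sketch like the following works. It uses only the characters and results listed above; the helper name `rejected` and the variable names are made up for this example.

```python
# Characters tested in the post (the same list used to create the test files)
tested = ['~', ':', "'", '+', '[', '\\', '@', '^',
          '{', '%', '(', '-', '"', '*', '|', ',',
          '&', '<', '`', '}', '.', '_', '=', ']',
          '!', '>', ';', '?', '#', '$', ')', '/']

# Characters each first-party tool accepted, copied from the results above
cloud_manager_ok = {"~", "-", ".", "_", "%"}
linode_cli_ok = {"~", "-", ".", "_", "?", "#", "%"}


def rejected(tested_symbols, accepted):
    """Return the tested symbols that are not in accepted, in test order."""
    return [s for s in tested_symbols if s not in accepted]


cloud_manager_rejected = rejected(tested, cloud_manager_ok)
linode_cli_rejected = rejected(tested, linode_cli_ok)

# Accepted by the Linode CLI but not by Cloud Manager
cli_only = sorted(linode_cli_ok - cloud_manager_ok)

print("Cloud Manager rejected:", cloud_manager_rejected)
print("Linode CLI rejected:", linode_cli_rejected)
print("Linode CLI only:", cli_only)
```

Of the 32 characters tested, this leaves 27 rejected by Cloud Manager and 25 rejected by the Linode CLI. The only difference between the two is that the CLI also accepted `#` and `?`.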