I have a long string (around 50 characters) that I need to insert into a database.
However, the field in the database is only 10 characters long (I cannot change it!).
I know there should be a way to hash or encrypt the string so that it is shortened, and then decrypt it later when I want to use the string again.

What is the best way to do this?
Please note that security is not important; all I need is to convert the string to a shorter set of characters that I can convert back to the original string.
I am looking for a C# solution.

I appreciate your responses in advance.
Comments
Wonde Tadesse 2-May-11 20:19pm    
Interesting question!

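0) In general, you can't take 50 arbitrary characters and reversibly make them 10. There are far more possible 50-character strings than 10-character ones, so that's impossible.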

1) A hash cannot be reversed, so hashing won't work.

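2) Encryption doesn't shorten data. With a block cipher such as AES, the output is padded to a multiple of the 16-byte block size, so an encrypted string will pretty much ALWAYS be a minimum of 16 bytes.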

3) You could always create a new table with an auto-incrementing id column and a string field that is x characters long (I recommend 255 characters), and store the data in the new table. You can then get the ID that was created for it and store that ID in the 10-character column of your main table, thus setting up a proxy relationship between the two tables. After that, you can retrieve the appropriate data from the new table based on the ID stored in the main table.
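
As a rough C# sketch of the proxy idea in point 3: the class below uses an in-memory dictionary as a stand-in for the new table, so `LongStringStore`, `Add` and `Get` are hypothetical names, and a real version would run an INSERT and a SELECT against the database instead. It assumes a 32-bit auto-increment ID, which has at most 10 digits and therefore fits the 10-character column.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical stand-in for the new lookup table: id (auto-increment) -> long string.
public class LongStringStore
{
    private readonly Dictionary<int, string> _rows = new Dictionary<int, string>();
    private int _nextId = 1;

    // Stores the long string and returns the key to put in the 10-character column.
    public string Add(string longValue)
    {
        int id = _nextId++;
        _rows[id] = longValue;
        // int.MaxValue is 2147483647 (10 digits), so a positive int always fits.
        return id.ToString(CultureInfo.InvariantCulture);
    }

    // Looks up the original string from the key stored in the main table.
    public string Get(string key)
    {
        return _rows[int.Parse(key, CultureInfo.InvariantCulture)];
    }
}
```

The main table then only ever sees the short key returned by `Add`, and the full 50 characters come back through `Get`.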

EDIT ==================

Since you can't modify the database in any way, your only other viable solution is to store the 50-character data in a file on disk (I'd go with XML myself), and set up the same ID-based relationship I described above. Of course, this won't work if another user has to get to the data as well, unless the XML file is on a shared storage resource.
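
A minimal sketch of the file-based variant, assuming an XML layout of `<items><item id="1">...</item></items>`. The layout and the `XmlStringStore` class are my own assumptions, not a fixed format, and the code makes no attempt to handle two processes writing the file at once.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Hypothetical file-backed lookup: each <item> holds one long string under a numeric id.
public static class XmlStringStore
{
    // Appends the long string to the XML file and returns its id as the short key.
    public static string Add(string path, string longValue)
    {
        XDocument doc = File.Exists(path)
            ? XDocument.Load(path)
            : new XDocument(new XElement("items"));

        // Next id = highest existing id + 1 (1 for an empty file).
        int nextId = doc.Root.Elements("item")
            .Select(e => (int)e.Attribute("id"))
            .DefaultIfEmpty(0)
            .Max() + 1;

        doc.Root.Add(new XElement("item", new XAttribute("id", nextId), longValue));
        doc.Save(path);
        return nextId.ToString();
    }

    // Returns the stored string for the given key, or null if it is not present.
    public static string Get(string path, string key)
    {
        XElement match = XDocument.Load(path).Root.Elements("item")
            .FirstOrDefault(e => (string)e.Attribute("id") == key);
        return match == null ? null : match.Value;
    }
}
```

The key returned by `Add` is what goes into the 10-character column; `Get` reads the file again to recover the full string.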

You could also change the form where the data is coming from, and restrict the input to 10 characters.

If the data is coming from an outside source, just use the first 10 characters submitted, and throw the rest away.
 
Comments
NuttingCDEF 2-May-11 15:01pm    
Good solution (specifically point 3) - my 5. Two things I'd add:

1. Note that normal Guids (which are a common choice for ID values) are 16 bytes, so can't easily be fitted into 10 chars - so you probably need to use an integer auto-increment field, not a Guid.

2. An alternative to a separate table (if your design permits) might be to split the 50 chars into maybe 6 substrings and store the chunks in separate records in your table. Add an index character (1 to 6, or 0 to 5) on the front of each chunk to identify its order; that index takes one of the 10 chars, leaving 9 for data, which is why 5 chunks of 10 chars won't work. If the problem is that you can't change ANYTHING about the database structure (including adding a new table), that might still work.
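
The chunking in point 2 could be sketched in C# roughly as follows. `Chunker` is a hypothetical helper, and it assumes the table has some other column that ties together the records belonging to one string. Each record is one index digit followed by up to 9 data characters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Chunker
{
    private const int FieldWidth = 10;            // width of the database column
    private const int DataWidth = FieldWidth - 1; // one character is used for the index

    // Splits the value into records of the form "<index><up to 9 characters>".
    public static List<string> Split(string value)
    {
        var records = new List<string>();
        for (int i = 0, index = 0; i < value.Length; i += DataWidth, index++)
        {
            if (index > 9)
                throw new ArgumentException("A single-digit index only covers 10 chunks (90 characters).");
            int length = Math.Min(DataWidth, value.Length - i);
            records.Add(index.ToString() + value.Substring(i, length));
        }
        return records;
    }

    // Rebuilds the original value from the records, in any order.
    public static string Join(IEnumerable<string> records)
    {
        return string.Concat(records
            .OrderBy(r => r[0])              // '0'..'9' sort correctly as characters
            .Select(r => r.Substring(1)));
    }
}
```

A 50-character string comes out as 6 records (five with 9 data characters, one with 5), each of which fits the 10-character field.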
#realJSOP 2-May-11 15:16pm    
I didn't say to use a guid. I said create an ID column. In any case, it's a moot point because the OP said he can't modify the database.
NuttingCDEF 2-May-11 15:29pm    
Agreed - I was more making the point that if, like many, Guid was the preferred choice for ID, it might not work.

And yes, if he can't modify the database I totally agree that it may be irrelevant - though from his later 'solutions' it looks as though he may well have to sidestep his database constraint somehow.
Wonde Tadesse 2-May-11 20:23pm    
Good answer.
You need to create your own algorithm to do this, and 50 to 10 is not going to be easy. For example, if a word has repeated letters, you can represent each run as a single letter followed by the number of repetitions.
Here's a simple example: a word like AAAABBTTTT can be written as A4B2T4. You can see that the number of characters has been reduced, though this is a very simple example.
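
The letter-plus-count idea is usually called run-length encoding. Here is a minimal C# sketch, assuming the input contains no digits (otherwise decoding is ambiguous); the `RunLength` class name is just for illustration.

```csharp
using System;
using System.Text;

public static class RunLength
{
    // Encodes each run as the character followed by its length.
    // Assumes the input contains no digits.
    public static string Encode(string input)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            int run = 1;
            while (i + run < input.Length && input[i + run] == c) run++;
            sb.Append(c).Append(run);
            i += run;
        }
        return sb.ToString();
    }

    // Reverses Encode: reads a character, then its (possibly multi-digit) count.
    public static string Decode(string encoded)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < encoded.Length)
        {
            char c = encoded[i++];
            int count = 0;
            while (i < encoded.Length && encoded[i] >= '0' && encoded[i] <= '9')
                count = count * 10 + (encoded[i++] - '0');
            sb.Append(c, count);
        }
        return sb.ToString();
    }
}
```

Note that this only helps when the data really has long runs: a string with no repeated neighbours, such as ABC, becomes A1B1C1 and doubles in length.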
Here is another example: if a word such as BAT occurs frequently, you can substitute it with a single "@" character. Likewise, you can try to create your own methods or rules. I hope I was able to help you to some extent.
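
A small C# sketch of the substitution idea, assuming the placeholder character never occurs in the real data. The `Substitution` class is hypothetical, and the table holds only the BAT example; more frequent words could be added with their own placeholders.

```csharp
using System;
using System.Collections.Generic;

public static class Substitution
{
    // Frequent word -> single placeholder character. The placeholder is
    // assumed never to occur in real data.
    private static readonly Dictionary<string, string> Table =
        new Dictionary<string, string> { { "BAT", "@" } };

    public static string Encode(string input)
    {
        // Refuse input that already contains a placeholder, since decoding
        // could not tell it apart from a substituted word.
        foreach (var pair in Table)
            if (input.Contains(pair.Value))
                throw new ArgumentException("Input contains placeholder " + pair.Value);

        foreach (var pair in Table)
            input = input.Replace(pair.Key, pair.Value);
        return input;
    }

    public static string Decode(string encoded)
    {
        foreach (var pair in Table)
            encoded = encoded.Replace(pair.Value, pair.Key);
        return encoded;
    }
}
```

Each substitution saves only a couple of characters per occurrence, so how much this shortens a string depends entirely on how often the chosen words actually appear.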
Good luck.
 
Comments
NuttingCDEF 2-May-11 15:06pm    
Agreed - but 50 chars into 10 (or even 16) is a tough call - under 2 (or under 3) bits per char. You might get there with Huffman / LZ etc. for longer strings, but getting that level of compression on a short string relies on being able to exploit known features of the strings.
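
To see how a general-purpose compressor behaves on a short string, here is a hedged C# sketch using `DeflateStream` (an LZ77 plus Huffman scheme) from `System.IO.Compression`. The `DeflateDemo` wrapper is hypothetical; compare the length of the returned byte array with the length of the input.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class DeflateDemo
{
    // Compresses the UTF-8 bytes of the text with raw Deflate.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            // Disposing the DeflateStream flushes the final compressed block.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
            {
                deflate.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    // Restores the original text from the compressed bytes.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(deflate, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For an ordinary 50-character sentence with little repetition, expect the output to be about as long as the input, nowhere near 10 bytes. The result is also binary, so putting it in a character column would need something like Base64, which adds roughly a third to the length.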
Tarun.K.S 2-May-11 15:12pm    
Aah right, Huffman compression, now I remember the exact name. Hmm, you are right, getting down to 10-15 is tough. The algorithm will be a bit complex, but if it's the requirement, he has to do it somehow!
NuttingCDEF 2-May-11 15:19pm    
Depending on the size of the space of possible strings to be compressed and the distribution of strings in that space, there are theoretical limits to what is possible - e.g. if all possible 50 char strings are equally common/likely and the character set uses all values from 0 to 255, then no compression at all is possible!
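
The limit can be checked by counting: a reversible mapping needs at least as many distinct field values as there are possible input strings. A short C# sketch, assuming every input string is equally likely and each field character carries 8 bits (256 values); `Capacity` is a hypothetical helper.

```csharp
using System.Numerics;

public static class Capacity
{
    // Number of distinct strings of the given length over an alphabet of the given size.
    public static BigInteger StringCount(int length, int alphabetSize)
        => BigInteger.Pow(alphabetSize, length);

    // True if every such string can be mapped to a distinct value of
    // fieldChars characters with 256 possible values each.
    public static bool Fits(int length, int alphabetSize, int fieldChars)
        => StringCount(length, alphabetSize) <= StringCount(fieldChars, 256);
}
```

Under those assumptions `Capacity.Fits(50, 256, 10)` and `Capacity.Fits(50, 26, 10)` are both false, and `Capacity.Fits(50, 3, 10)` is true while `Capacity.Fits(50, 4, 10)` is false. So 50 characters fit into 10 only if they are drawn from at most 3 distinct symbols, or if the real data is far from uniformly distributed.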

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


