Hey all,

I have a problem with my graduation project: Arabic-language text compression/decompression.

I have implemented a compression algorithm I found here which outputs a binary array for the compressed text, and I have been wondering whether I can apply binary LZW on top of it to improve my output, because Arabic text is not covered by ASCII (see the encoding sketch after this question).

And how should I implement binary LZW?

The algorithm I used before is at this link:

[^]

I got a good output, but I would like a better one, so can I use binary LZW?

And do you have a good resource for implementing it in Java?

Thanks
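
A minimal sketch of that encoding step, assuming Java and UTF-8 (the class name ArabicBytes and the sample string are only illustrations): a compressor never needs ASCII, because any Arabic string becomes a plain byte array once an encoding is chosen.

import java.nio.charset.StandardCharsets;

public class ArabicBytes {
    public static void main(String[] args) {
        // Any Arabic string becomes an ordinary byte array once an encoding is chosen.
        // UTF-8 is assumed here; the sample text is only an illustration.
        String text = "السلام عليكم";
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        System.out.println("characters: " + text.length());  // UTF-16 code units in the String
        System.out.println("bytes:      " + data.length);    // what a compressor actually sees

        // The reverse direction, needed after decompression:
        String roundTrip = new String(data, StandardCharsets.UTF_8);
        System.out.println(text.equals(roundTrip));           // true
    }
}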
Comments
Tomas Takac 28-Nov-14 12:48pm    
What exactly is your problem? Arabic text is just data. Sure, you can apply LZW compression to it. If you are looking for a Java implementation, there is one here on CodeProject: LZW Compression Algorithm Implemented in Java[^]
Member 11199376 28-Nov-14 13:23pm    
My problem is: is compressing the 0s and 1s more efficient than compressing the Arabic text itself? And sorry, I saw that implementation but I didn't know how to make it work!
Tomas Takac 28-Nov-14 15:34pm    
I fail to see the difference. Isn't text also just a bunch of 0s and 1s? The algorithm you link to in your question is very naive; LZW should give you better results, if that's what you are asking.
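
To illustrate the comment above, here is a minimal LZW compressor working directly on bytes, so UTF-8-encoded Arabic text is treated exactly like any other data. This is only a sketch under that assumption: the class name LzwBytes is made up, and the output codes are left as a List<Integer> rather than bit-packed.

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LzwBytes {

    // Compress a byte array into a list of integer LZW codes.
    public static List<Integer> compress(byte[] data) {
        Map<String, Integer> dict = new HashMap<>();
        // Seed the dictionary with all 256 possible single-byte values.
        for (int i = 0; i < 256; i++) {
            dict.put("" + (char) i, i);
        }
        int nextCode = 256;
        String w = "";
        List<Integer> out = new ArrayList<>();
        for (byte b : data) {
            char c = (char) (b & 0xFF);       // treat each byte as an unsigned value
            String wc = w + c;
            if (dict.containsKey(wc)) {
                w = wc;                       // keep extending the current phrase
            } else {
                out.add(dict.get(w));         // emit code for the longest known phrase
                dict.put(wc, nextCode++);     // learn the new phrase
                w = "" + c;
            }
        }
        if (!w.isEmpty()) {
            out.add(dict.get(w));
        }
        return out;
    }

    public static void main(String[] args) {
        // Arabic text is just bytes once an encoding such as UTF-8 is applied.
        byte[] input = "السلام عليكم السلام عليكم".getBytes(StandardCharsets.UTF_8);
        List<Integer> codes = compress(input);
        System.out.println("input bytes:  " + input.length);
        System.out.println("output codes: " + codes.size());
    }
}

Note that each code needs more than 8 bits when written to a file, so the actual saving depends on packing the codes tightly rather than storing them as text.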

1 solution

This is not a correctly posed question. More exactly, there is no real problem here.

Compression has nothing to do with languages. However, the compression ratio does depend on the content of the data. You can only compare blocks of data of the same size in bytes; if you compress such blocks, the sizes of the compressed data will differ. Isn't that completely natural?

Why do you think compression is possible at all? Because the compression algorithm finds some redundancy in the data and tries to optimize the representation of that redundant data. Imagine that all your data consists of binary ones. Then the compressed data could simply say: "80 billion 1 bits". Now imagine that the data is a random sequence of bits. Then the compression ratio, on average, will be about 1, or even slightly below it, even with a good algorithm and large data sets, because only a negligible amount of redundancy can appear at random. Isn't that logical?
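That argument is easy to demonstrate with the JDK's own DEFLATE implementation (java.util.zip.Deflater); the class name RedundancyDemo and the 100,000-byte buffers below are arbitrary choices for the sketch. Highly repetitive data shrinks to almost nothing, while random data does not compress at all:

import java.io.ByteArrayOutputStream;
import java.util.Random;
import java.util.zip.Deflater;

public class RedundancyDemo {

    // Return the compressed size of the input using the JDK's DEFLATE implementation.
    static int compressedSize(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.size();
    }

    public static void main(String[] args) {
        byte[] allOnes = new byte[100_000];
        java.util.Arrays.fill(allOnes, (byte) 0xFF);   // maximally redundant data

        byte[] random = new byte[100_000];
        new Random(42).nextBytes(random);              // essentially no redundancy

        System.out.println("all-ones: 100000 -> " + compressedSize(allOnes));
        System.out.println("random:   100000 -> " + compressedSize(random));
    }
}

On a typical run the all-ones buffer compresses to a few hundred bytes at most, while the random buffer stays at roughly its original size or slightly larger.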

Perhaps you will understand this better if you read about data compression: http://en.wikipedia.org/wiki/Data_compression[^].

—SA
 