Saturday, February 20, 2010

Status

The software has been complete for a while, since about December or so. I haven't touched it in a while; for now it's once again on hold.

Other status: I've been thinking of posting a large file that shows the exact number of combinations for a given count of bits turned on, say 5 turned on out of 8 bits (56 combinations), whereas 8 bits as a whole have 256 possibilities. I've built a very fast computation program using GMP arbitrary precision to allow sizes larger than 64 bits (actually, with 64-bit integers you could go up to about 67 bits before it would overflow).

One application of this is a limited form of data compression; the downside is that you need to know how many bits are turned on and the total length, which removes much of the advantage of using it. It is especially useful if you had, say, 20 out of 80 bits turned on (like Keno). I'll post two different pages of data soon, depending on where I can host them.
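To make the compression idea concrete, here is a minimal sketch of one way a bit pattern could be turned into an index and back, using the standard combinatorial number system. This is an illustration only, not the GMP program described above; the function names rank and unrank are made up for the example.

```python
from itertools import combinations
from math import comb


def rank(positions):
    """Index of a set of on-bit positions among all sets of the same size."""
    return sum(comb(p, i + 1) for i, p in enumerate(sorted(positions)))


def unrank(index, bits_on):
    """Inverse of rank(): recover the on-bit positions from an index."""
    positions = []
    for i in range(bits_on, 0, -1):
        p = i - 1
        # Find the largest p with comb(p, i) <= index.
        while comb(p + 1, i) <= index:
            p += 1
        index -= comb(p, i)
        positions.append(p)
    return sorted(positions)


# Round-trip every pattern with 5 of 8 bits on.
patterns = list(combinations(range(8), 5))
indexes = sorted(rank(p) for p in patterns)
print(len(patterns), indexes[0], indexes[-1])

# Keno-style case: 20 of 80 bits on.
print(comb(80, 20).bit_length(), "bits for the index instead of 80")
```

The saving only appears once the length and the number of bits on are known out of band, which is the downside mentioned above. For the Keno-style case the index fits in about 62 bits rather than 80.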

Format: Bits on, combinations, size in bits.
0 - 1 (1)
1 - 8 (3)
2 - 28 (5)
3 - 56 (6)
4 - 70 (7)
5 - 56 (6)
6 - 28 (5)
7 - 8 (3)
8 - 1 (1)
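
The table above can be reproduced with a few lines of Python, whose integers are arbitrary precision much like GMP's. This is a sketch rather than the actual program: the helper names are hypothetical, and the "size in bits" column is assumed to mean the bits needed to index every combination, with a minimum of 1. The last loop checks the 64-bit limit mentioned earlier.

```python
from math import comb


def bits_needed(count):
    """Bits needed to index `count` distinct combinations (minimum 1)."""
    return max(1, (count - 1).bit_length())


def combination_table(width):
    """Rows of (bits_on, combinations, size_in_bits) for a `width`-bit word."""
    return [
        (k, comb(width, k), bits_needed(comb(width, k)))
        for k in range(width + 1)
    ]


for bits_on, combos, size in combination_table(8):
    print(f"{bits_on} - {combos} ({size})")

# Widest word whose largest combination count still fits in an
# unsigned 64-bit integer.
width = 1
while comb(width + 1, (width + 1) // 2) < 2 ** 64:
    width += 1
print("64-bit limit:", width)
```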

Monday, June 15, 2009

Semeir Status

Currently the project is looking very good; I've been able to work on it for a couple dozen hours. The main functions are all done and working, and are currently being brute-force tested and stress tested.

On a bright note, here are some little tidbits I see so far.

  1. The compiled and stripped code for the entire cipher is about 5k, and the actual cipher file is only about 2k. The code size suggests it will be workable in an embedded environment.


  2. The key size (if we optimize for size) for a good 4-8 byte cipher will likely be anywhere from 300-1500 bytes total, plus a stack of 1k or less.

  3. Thread-safe: after the key is generated, the encrypt/decrypt functions make small copies of working data in local space, which makes them thread-safe.

  4. Speed: it is very fast. Unlike the previous scheme, this one has fixed limits and can easily be optimized a lot more for speed, which also translates to size. Because the key size is rather small, other speedups will be noticed when it is implemented in the kernel.

Once testing has finished and the last few functions are written, along with a new 'simple' program for encryption and PRNG, the code will likely be ready for release.

Thursday, June 4, 2009

Semeir from scratch

It is truly amazing how much one can learn over a period of time, as well as how much your style of coding changes.

After hard and careful consideration and trying to keep code from the old project, I have decided to start completely over from scratch. Looking carefully at my goals, I've re-thought and redone the API for the library, and I'm happy to say it's a lot smaller and simpler.

Goals:
  • Simple Cipher: The new public API will only have about 7 functions. The encryption also only has about 4 rules/steps to follow.

  • Documented & Tested: Following the advice of 'clean code', I'm writing documentation before working on the code. Test code will be written to exercise both the code and its failure behavior.

  • Less Memory: Key data for a Standard key can be as small as 1k, or as large (estimated max) as 128k.

  • Cipher & PRNG: Build what we are working for, and not everything.


Depending on how busy I get, I foresee this being done by the end of the year, or earlier. I just hope this newer version is accepted more by the community than my previous one.

Saturday, May 24, 2008

Java Experience

Over the course of the last several weeks, I have learned Java, and after being part of a team larger than myself, I have come to a sudden realization: I have been doing things wrong, and I must apologize.

So in the upcoming version, which may be a few months away yet, I'm making several large changes, and will worry about improving the code to a large degree another time.

1) Eclipse CDT IDE. Until now I've built the cipher entirely in a basic text editor, which brings many annoyances when you need to look up functions, parameter orders, names, and descriptions. Eclipse has been a wonderful tool so far, and makes refactoring and building code easier. I'll be giving all variables more descriptive and appropriate names.

2) Doxygen for inline documentation. My cipher is under-documented, and after reading several good articles, references and suggestions, I'm going to go completely through and redo all of my comments, from overly brief to in-depth where appropriate, and have Doxygen generate my docs later.

3) I've tried before to over-optimize my code, and after looking at a couple of the functions I realize they are kind of a mess. This isn't what I was going for. I'm going to rebuild them for readability and simplicity rather than speed. I figure computers are getting faster and compilers are getting smarter, so I will instead break the code into something everyone can read and maintain, and leave the cycle saving to the compiler.

4) Unit testing. I've met a wonderful thing called JUnit tests, which make sense once you understand them well enough. I'm systematically writing a unit test for each file I'm updating with comments; combined, these will help in the long run. My earlier test.c I'll leave as it is.

Estimated next release, August sometime.

Era Scarecrow

Wednesday, April 2, 2008

Snake Oil

After my first release in a long time, version 2.4.7, a couple of individuals have described semeir as having a large chance of being 'snake oil'. As much as I disagree, I am forced to agree they are correct that I don't have enough statistical data to prove my cipher is 'secure'. From my own tests it's working; however, I'll let the data speak for itself.

They've also claimed that declaring I'm using an OTP (One Time Pad), or claiming I have one, is bad practice and misleading. It's true it's not a true One Time Pad (working with pencil and paper from a true list of random data), but nothing is truly random anyway. If you consider that a One Time Pad uses a random stream of data for its security, that's exactly what my OTP functions do; they double as multiple parallel RNGs if you want them that way.

http://lwn.net/Articles/274857/


http://www.geocities.com/rtcvb32/zips/dh.tbz2.zip <- PRNG tests with Diehard, 1Gig

The next bits of data come from this: /apps/misc_tests/otp.c uses a blocksize of 9 by default, and its number of returns before a repeat is:

9 - 10,625,324,586,456,701,730,816 Ints (38,654,705,664 -TeraBytes)

I used a 4Gig random data block; the results are pretty astounding.
1's 17,179,831,574 - 49.999890540493652224540710449219%
0's 17,179,906,794 - 50.000109459506347775459289550781%
(50% to within 1/10000th, from a sample that is 1:2508.253 the size of the full length)
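
For anyone wanting to repeat this kind of tally, here is a minimal sketch of a ones-versus-zeros count over a block of bytes. It is not the code used for the figures above: count_bits is a hypothetical helper, and os.urandom merely stands in for the cipher's output. The two counts above add up to 34,359,738,368 bits, which is exactly 4 GiB.

```python
import os


def count_bits(data):
    """Return (ones, zeros) for every bit in a bytes-like object."""
    ones = sum(bin(byte).count("1") for byte in data)
    zeros = len(data) * 8 - ones
    return ones, zeros


# Stand-in data: 1 MiB from the operating system's random source.
sample = os.urandom(1 << 20)
ones, zeros = count_bits(sample)
total = ones + zeros
print(f"1's {ones:,} - {100 * ones / total:.6f}%")
print(f"0's {zeros:,} - {100 * zeros / total:.6f}%")
```

A balanced bit count is only one of many statistical checks, which is why the Diehard results are linked as well.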
Currently, it will take a couple of years before the code is accepted and there are tests proving it's worth using. It already has many advantages; although it is slower and uses more resources than other ciphers, it still does quite well.

However, that doesn't stop you from trying it out and using it as an 'experimental alternative'. If you feel it's not secure enough for you, that's OK.

Era

Wednesday, March 12, 2008

Compression Idea #4

I've had a multitude of ideas, but recently I've decided to put one to the test.

I know I'm drifting from encryption yet again, but this may be worth it.

I've considered the idea of taking data, turning it into a puzzle with some redundancy, then removing all the data you can and compressing the remainder, from which the rest can be rebuilt. With this in mind, I'm talking about something similar to Sudoku, although it may be larger than 9x9; it may be more like 25x25 or so, too large for humans (short of very, very intelligent and bored people) to do. The exact method is uncertain as yet, but I'll give you some information on what I've got so far.

1) I've successfully created a virtual Sudoku environment, which looks and feels like Sudoku and is meant to scale up, but for testing and building purposes it can't go beyond 9x9 yet.

2) I've programmed only one method (called method1), which does the simplest of calculations and fills in the more obvious solutions. Using this alone, it can solve puzzles rated easy, and maybe intermediate. I haven't fully tested it, but the results were so astounding for a first run with no debugging that I was surprised.
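
The post does not spell out what method1 actually does, so the following is only a guess at the kind of "simplest calculation" meant: fill in any empty cell that has exactly one digit left after ruling out its row, column, and 3x3 box. The function names and the use of 0 for an empty cell are assumptions made for this sketch.

```python
def candidates(grid, r, c):
    """Digits not yet used in the row, column, or 3x3 box of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - used


def fill_singles(grid):
    """Repeatedly fill cells with exactly one candidate; 0 marks an empty cell."""
    progress = True
    while progress:
        progress = False
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    options = candidates(grid, r, c)
                    if len(options) == 1:
                        grid[r][c] = options.pop()
                        progress = True
    return grid


# A valid solved grid built from a shifting pattern, then a few cells blanked.
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
puzzle = [row[:] for row in solved]
for r, c in [(0, 0), (4, 4), (8, 8), (2, 7)]:
    puzzle[r][c] = 0

print(fill_singles(puzzle) == solved)
```

A rule this simple stalls on harder puzzles, which is where the additional methods would come in.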

There are more methods to add, I think in the range of 4-5 more. When it can successfully do any puzzle on the go in seconds, the algorithm is ready for the next stages, which I will outline now.

1) Build solver algorithm
2) Build converter from/to sudoku
3) Build number remover, which selectively removes as many numbers as it can, until it can remove no more. (Expert level, hopefully)
4) Experiment with compression until a proper method is found that will hopefully compress the data down enough to compensate for the extra data, and make the whole venture worth it.

This is all in the hope that the compressed puzzle is smaller than the original data used to build the puzzle in the first place. But even with only one method, the results for a solver (even if compression fails) are promising enough to complete the project.

Era

Wednesday, March 5, 2008

OTP (One Time Pad) Improved!

I've made some major improvements to the program, and have a working Windows GUI using AutoHotkey. I may move away from this eventually, but I see no reason to just yet.

A little history: originally, the OTP was a simple RNG with which you could have multiple streams working. However, it only returned 3 billion ints, and you could only seed it once.

The new version is still in the testing stages but has had no problems so far. You can seed it with as many ints as its blocksize will allow, and of course the number of ints returned before it repeats or starts over is as this list shows.

Note: I am not sure if Decabyte is right, but it's the term I'm using.

k 2^10 (1024)
Mb 2^20 (1,048,576)
Gb 2^30 (1,073,741,824)
Tb 2^40 (1,099,511,627,776)
Pb 2^50 (1,125,899,906,842,624)

1 - 32 (256 bytes)
2 - 32,768 (128k)
3 - 12,582,912 (49,152k)
4 - 4,294,967,296 (16,384Mb)
5 - 1,374,389,534,720 (5,120 - Gigabytes)
6 - 422,212,465,065,984 (1,572,864 -Gigabytes)
7 - 126,100,789,566,373,888 (458,752 -Terabytes)
8 - 36,893,488,147,419,103,232 (134,217,728 -TeraBytes)
9 - 10,625,324,586,456,701,730,816 (38,654,705,664 -TeraBytes)
10 - 3,022,314,549,036,572,936,765,440 (10,995,116,277,760 -TeraBytes)
11 - 851,083,777,008,698,938,993,147,904 (3,023,656,976,384 -Petabytes)
12 - 237,684,487,542,793,012,780,631,851,008 (844,424,930,131,968 -Petabytes)
13 - 65,917,831,211,867,928,877,828,566,679,552 (234,187,180,623,265,792 -Petabytes)
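
As a cross-check on the list, here is a small sketch that converts a count of ints into the units defined above, assuming 4-byte ints. The formula in returns_before_repeat is only a pattern observed in the listed counts for block sizes 2 through 13 (blocksize times 2^(8 x blocksize - 2)); it is not taken from the OTP source, and both function names are hypothetical.

```python
BYTES_PER_INT = 4  # assumption: 32-bit ints

UNITS = {"k": 10, "Mb": 20, "Gb": 30, "Tb": 40, "Pb": 50}


def returns_before_repeat(blocksize):
    """Observed pattern in the listed counts; not taken from the OTP code."""
    return blocksize * 2 ** (8 * blocksize - 2)


def size_in(unit, ints):
    """Whole number of `unit`s occupied by `ints` integers."""
    return ints * BYTES_PER_INT // 2 ** UNITS[unit]


for blocksize, unit in [(4, "Mb"), (9, "Tb"), (13, "Pb")]:
    ints = returns_before_repeat(blocksize)
    print(f"{blocksize} - {ints:,} ({size_in(unit, ints):,} {unit})")
```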

These are the calculated figures. Usually the returns are two returns XORed together, so the usable count is half these values, but that is still more than enough for any normal message, and probably enough to use on your drive to randomize it before you put an encrypted FS on it.

I will have more later.

Era