
ANY2BAT BASE85

Posted: 05 Feb 2014 16:44
by einstein1969
Hi to all,

I started a batch script to embed any file in a BAT file, or in another text format compatible with the chosen dictionary.

It uses Base85 with characters that do not seem to cause problems. (I'm no expert in this!)

For compressed files the ratio of wasted space is quite low.

I tested it on GetInput.exe.cab (566 bytes); the output is approximately 695 bytes.

This is a beta; if anyone wants to help me test or finish it, I'd be grateful.

Code:

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: ANY2BAT.cmd - Read any file and convert it to Base85 (or another base) using ASCII chars
:: ------------------------------------------------------------------------------
:: author: einstein1969
:: BETA!
::
:: 6/2/2014 Added code base 41 to 84
::
::
:: TODO: encode the final bytes of the file left in the buffer, functional test.
:: Add CRC check
::
:: For Binary to Text encoding : http://en.wikipedia.org/wiki/Binary-to-text_encoding
::
:: There are at least 2 types of encoding algorithm. One works on bytes and the
:: other works on bits. This post has an example that works on bits:
::
:: http://www.dostips.com/forum/viewtopic.php?f=3&t=4963
::
:: If the alphabet follows ASCII code order, the FC reading technique can also be
:: used for decoding. This is very simple.
::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: %1 = input file

@echo off & setlocal EnableDelayedExpansion


:: ------------------------------------------------------------------------------
:: Alphabet. You can change the order and the choice of characters.
:: The decode subroutine only needs each character's position.
:: There are 86 chars. The last char encodes 0x00000000. Good for the zeros in makecab output: 1 char instead of 5 (saves 4 bytes)

set alph85=#$'^(^)*+,-./0123456789:;=?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]_`abcdefghijklmnopqrstuvwxyz{}~


:: This alphabet has one extra char for the zero trick. You can change the symbol.

set alph52=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz~


:: ------------------------------------------------------------------------------
:: Create a dummy file (filled with @ = 0x40) to compare against with FC. Jeb's technique.

<nul >d.tmp set /p ".=@" & set ds=%~z1 & for /l %%n in (1,1,32) do (if !ds! gtr 0 set /a "ds/=2" & type d.tmp >>d.tmp)


:: ------------------------------------------------------------------------------
:: Read the file and encode.

For %%c in (encode85 encode52) do (

  set output=
  set /A i=1, buffer=0, p=0, bi=bo=0

  for /f "eol=F skip=1 tokens=1,2 delims=: " %%a in ('fc /b "%~dpf1" d.tmp') do (
    for /L %%B in (!i!, 1, 0x%%~a) do call :%%c 40
    call :%%c %%~b
    set /a i=0x%%~a+2
  )

  echo(
  echo %%c: [Bytes input: !bi! - Bytes output: !bo!]
  echo !output!

)

:: ------------------------------------------------------------------------------
:: Del dummy file

del d.tmp


goto :eof


:: ----------------------------------------------
:encode85 %1 = HEXByte
::
:: Queue each byte and, every 4 bytes (32 bits), encode them as 5 ASCII chars (efficiency 4/5 = 80%, 20% wasted).
:: Batch integers are signed, so unsigned division is not available. For a negative number I first
:: subtract 0x8000002A. This quantity is exactly divisible by 85 (decimal), i.e. no remainder.
:: The number 0x1818182 is 0x8000002A/85 (decimal).

rem This could be done with a string (set buffer=!buffer!%1) and moved outside the call,
rem so that the subroutine is called only once every 4 bytes!

set /a "buffer=(buffer << 8)|0x%1, p+=1, bi+=1"

if !p! gtr 3 (
  if !buffer! equ 0 ( set "output=!output!!alph85:~85,1!" & set /a bo+=1
  ) else (
    set /a start=0, bo+=5
    if !buffer! lss 0 set /a "n0=(buffer-0x8000002A) %% 85, buffer=(buffer-0x8000002A)/85+0x1818182, start+=1"
    for /L %%p in (4,-1,!start!) do set /a "n%%p=buffer %% 85, buffer/=85"
    for /f "tokens=1-5" %%f in ("!n0! !n1! !n2! !n3! !n4!") do set "output=!output!!alph85:~%%f,1!!alph85:~%%g,1!!alph85:~%%h,1!!alph85:~%%i,1!!alph85:~%%j,1!"
  )
  set /a buffer=0, p=0
)
goto :eof


:: ----------------------------------------------
:encode52 %1 = HEXByte
::
:: Queue each byte and, every 4 bytes (32 bits), encode them as 6 ASCII chars (efficiency 4/6 = 67%, 33% wasted).
:: This works for base 41 to base 84.

set /a "buffer=(buffer << 8)|0x%1, p+=1, bi+=1"

if !p! gtr 3 (
  if !buffer! equ 0 ( set "output=!output!!alph52:~52,1!" & set /a bo+=1
  ) else (
    set /a start=0, bo+=6
    if !buffer! lss 0 set /a "n0=(buffer-0x8000001C) %% 52, buffer=(buffer-0x8000001C)/52+0x2762763, start+=1"
    for /L %%p in (5,-1,!start!) do set /a "n%%p=buffer %% 52, buffer/=52"
    for /f "tokens=1-6" %%f in ("!n0! !n1! !n2! !n3! !n4! !n5!") do set "output=!output!!alph52:~%%f,1!!alph52:~%%g,1!!alph52:~%%h,1!!alph52:~%%i,1!!alph52:~%%j,1!!alph52:~%%k,1!"
  )
  set /a buffer=0, p=0
)
goto :eof
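
The offset trick described in the :encode85 comments can be checked outside cmd. What follows is a minimal sketch in Python (not batch, purely for illustration) that emulates the signed 32-bit values SET /A works with. The helper names are hypothetical, and the sketch only exercises buffer values at or above 0x8000002A.

```python
# Emulates the signed 32-bit view of the buffer to check the offset trick
# from :encode85. All names here are hypothetical, chosen for illustration.

OFFSET = 0x8000002A            # exactly divisible by 85
OFFSET_QUOTIENT = 0x1818182    # OFFSET / 85


def to_signed32(u):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return u - (1 << 32) if u >= (1 << 31) else u


def divmod85_signed(buffer):
    """Quotient and remainder by 85 of the unsigned value behind a negative buffer."""
    shifted = buffer - to_signed32(OFFSET)   # unsigned value minus OFFSET, non-negative here
    return shifted // 85 + OFFSET_QUOTIENT, shifted % 85


assert OFFSET == 85 * OFFSET_QUOTIENT
for u in (0x8000002A, 0x80000100, 0xDEADBEEF, 0xFFFFFFFF):
    assert divmod85_signed(to_signed32(u)) == divmod(u, 85)
print("offset trick matches unsigned division")
```

The same check works for :encode52 by swapping in 0x8000001C, 0x2762763 and 52.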


result:

Code:

>any2bat GetInput.exe.cab

encode85: [Bytes input: 566 - Bytes output: 685]
?pLJQ~6EKq`~31'`S~#z*K*#BB.$~=N=fq#BB1'##]K+~##'AW1'/V*;xNk:JQ+gE3uQ?C#0pE.HU:(H
'$-*Z}TRO`4il\(#s`1C5MP4+`P0mm]fO.NcvE(OG,c`JBjPGDQj]Y9iT:Mh=P{p6q[A[AJqSceay$Ce
B3)+w8MJ#G$]26NqR$}ZB+CVv3l{B*Qa-T8=E{wc)K(`r-lQNaqs-1$JaX,1@oLfV=3}9YYP5If)qShE
uS?$ELX/SQOXc7KBa\Ms/M[;W1po7u@}AUJ+qH3-4[mm_fZuhSA't'CAkeeQ5m*hbh`Z_roj+E+o_dRa
A(.W*veg=(C/o2sN@??D1*K(CNf@LP6,`dr.4EtXob37YNU[')EA,TI7hSk_5d-K#Ep*xY$1J@-uxRg{
UUJdT)rRgkGMBa#'mwQGLh_B0=N_qM6@r('3B{)YOJPiTjULTiJ\ftmD\@v`F`8+l-*We6sQ*pC)U/6G
#{.xQOMS)a{P=Fbid?Qr?a1ZJ9h1`O)(Rv+;4R;+4ch(yiP@bsG0xmKF(5[j9unM+52SGS/@_d12`0'6
_S#$ipQKB}1t;(FAo7zY4px@]K7;68'x7xs@/c-5ewy-.D6s+KC2MuNJs7m1,g?Ux\PQov27CCr2:4a1
'}ha=FRE,rH7tsB3S+MGAf_aIHt.xjIl3\z]dw0)[kIDx

encode52: [Bytes input: 566 - Bytes output: 821]
DVWTdy~CTwIWI~BwyCBs~AGuWFcACPQeo~DLaPKIACPQjkAACpVs~AAAGJQBhsOAgDHqwwJEtValQCCX
ybJAAzngGUGSJSvAOTOfHoGMwGGwJBqwUAGSXqHCNYDvkCFhksVsIeAhpmKnUiDEVFgGBDqSGSbMJHou
NZGLDwXIFnnXLVHJZLAsKEwXIQLGVCBDlxEfGCitaQcALjoeIYFvIWjDlPVDyCBmoFVvHriYoDKqDzSA
feVtxBKtJISFKQHNWYHvDerDcZOOODJYKwHzFiEuMAimHATUGEHsvOGmlJSFdnBdSDpfWdiBWhHnOBnp
GQqDddYuoKKDyCyCHbXGYiHGhRhDdtGYGDjQOsxCQBhvsgHoHeQXJDmKnSHhnAqDduDzrkIYqcqDsqQR
VFUdIXdBiGjXwFXiqtcCSAlnuBNcSAXXHzOdPHGUojyEITWGaCdwQdQUINDnBEMMNOdAIJYjRaLCrCKQ
GTVXMAiqXWfEXjTNqAUHxWkFJefAmDLbxBgCTcMIBByPjemFcczQCXGTdWBLHTixiJHQdXNwHlVPWBFe
RZHqFoSBURGROBzAGzBQCFcqKnjwFijPwZIOVzODUHeazCsJIhMAcsSdGDCTLdsuIKfWzAFitWsEVYJF
YENejjADKjhuMCLTCchBVccWOBtbUdftGCHDJtFqvHcBoEDTgDjfdFPOKAojSEycmIsAVAtxLDXTxKOu
KyquUEAcRdrDtBnabBKRBtMBDPImgdFkQSiBqSQTdBqentqAVUkeSEQyEShEdJpUJeGCjHkDiuHGGwKX
WVLnJSktXwITAdjyJNLmh
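
The two byte counts can be cross-checked by hand. Here is a small Python sketch (not batch; the function name is hypothetical), assuming only complete 4-byte groups are emitted and that each "~" in the listings stands for one all-zero group.

```python
def encoded_length(input_bytes, zero_groups, chars_per_group):
    """Output length when only complete 4-byte groups are encoded."""
    full_groups = input_bytes // 4   # a trailing 1-3 bytes are not counted
    return (full_groups - zero_groups) * chars_per_group + zero_groups


# 566 input bytes; "~" appears 5 times in each listing above.
print(encoded_length(566, 5, 5))   # base 85 -> 685
print(encoded_length(566, 5, 6))   # base 52 -> 821
```

566 bytes give 141 complete groups plus 2 leftover bytes, so both reported totals account for the complete groups only.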


EDIT: Added comments

einstein1969

Re: ANY2BAT BASE85

Posted: 05 Feb 2014 18:19
by einstein1969
I have added comments.

einstein1969

Re: ANY2BAT BASE85

Posted: 05 Feb 2014 18:47
by carlos
I like it, but maybe characters like * and : will cause problems for decoding.

I would like base 52 (a-z, A-Z); it is really simple.
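
The size cost of a letters-only alphabet can be estimated with a short calculation. This is a rough Python sketch (the function name is hypothetical) that finds how many characters a base needs to hold any 32-bit group.

```python
def chars_per_group(base, bits=32):
    """Smallest number of base-`base` digits able to represent any `bits`-bit value."""
    digits, capacity = 0, 1
    while capacity < (1 << bits):
        capacity *= base
        digits += 1
    return digits


for base in (40, 41, 52, 84, 85):
    n = chars_per_group(base)
    print(f"base {base}: {n} chars per 4 bytes, efficiency {4 / n:.0%}")
```

Under this calculation every base from 41 to 84 needs 6 characters per 4 bytes, while base 85 is the smallest that fits in 5.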

Re: ANY2BAT BASE85

Posted: 06 Feb 2014 08:52
by einstein1969
carlos wrote: I like it, but maybe characters like * and : will cause problems for decoding.

I would like base 52 (a-z, A-Z); it is really simple.


I have added it, as a demonstration.

einstein1969

Re: ANY2BAT BASE85

Posted: 08 Feb 2014 08:03
by dbenham
Nice work, but you have 2 serious bugs that should be easy to fix.

1) Your code assumes the file length is a multiple of 4. The remainder (modulo 4) is dropped from your output.

2) Your code completely ignores any number of trailing @ at the end of the file.
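
One possible way to address both points, sketched in Python rather than batch: this is not a port of the script and its output is not claimed to match the listings above. It assumes the file size is known (the script already reads %~z1), so trailing @ bytes can be restored, and it borrows the usual Ascii85 tail convention (n leftover bytes become n+1 characters). All function names are hypothetical.

```python
ALPHABET = (
    "#$'()*+,-./0123456789:;=?@ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "[\\]_`abcdefghijklmnopqrstuvwxyz{}~"
)  # the script's 86 chars; index 85 ("~") marks an all-zero group


def bytes_from_fc(diffs, size, filler=0x40):
    """Rebuild a file from FC /B style differences against an all-'@' dummy.

    Offsets missing from `diffs`, including a run at the very end, hold the filler.
    """
    return bytes(diffs.get(i, filler) for i in range(size))


def encode85(data):
    out = []
    for pos in range(0, len(data), 4):
        chunk = data[pos:pos + 4]
        value = int.from_bytes(chunk.ljust(4, b"\0"), "big")
        if len(chunk) == 4 and value == 0:
            out.append(ALPHABET[85])            # zero shortcut, full groups only
            continue
        digits = []
        for _ in range(5):
            value, d = divmod(value, 85)
            digits.append(ALPHABET[d])
        digits.reverse()                        # most significant digit first
        out.append("".join(digits[:len(chunk) + 1]))   # short tail: n bytes -> n+1 chars
    return "".join(out)


def decode85(text):
    index = {c: i for i, c in enumerate(ALPHABET)}
    out, pos = bytearray(), 0
    while pos < len(text):
        if text[pos] == ALPHABET[85]:
            out += b"\0\0\0\0"
            pos += 1
            continue
        group = text[pos:pos + 5]
        pos += 5
        value = 0
        for c in group.ljust(5, ALPHABET[84]):  # pad a short tail with the top digit
            value = value * 85 + index[c]
        out += value.to_bytes(4, "big")[:len(group) - 1]
    return bytes(out)


data = bytes_from_fc({0: 0x41, 2: 0x42}, size=6)   # b"A@B@@@"
assert decode85(encode85(data)) == data
```

Because the tail length is implied by the number of characters in the last group, no separate length field is needed for decoding.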


Dave Benham