Sort Records

Posted: 19 Oct 2014 13:16
by foncesa
Hi,

I want to sort the data in a flat text file; it's not a CSV.
The sort key starts at character 10 and is a block of 9 digits. Is that possible?
111000100910044019161020140000000892200041962020100104810000000001
222000322131044003161020140000000500000053548020100105211000000001
333786214411044011161020140000001500000208826010100606311000000000
111087652356044052161020140000000815749135335260100402611000000001

RESULT:
222000322131044003161020140000000500000053548020100105211000000001
111087652356044052161020140000000815749135335260100402611000000001
333786214411044011161020140000001500000208826010100606311000000000
111000100910044019161020140000000892200041962020100104810000000001

Re: Sort Records

Posted: 19 Oct 2014 13:31
by Yury

sort /+10 "example.txt"
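
As an illustration of what the command above does with the sample records, here is a minimal Python sketch. Python is an assumption for the sake of a checkable example and is not used anywhere in this thread; the helper name `sort_from_column` is hypothetical. The sketch compares each line from its 10th character to the end of the line, and it does not reproduce the locale or case-insensitivity rules of the real SORT command.

```python
# Sample records from the first post.
records = [
    "111000100910044019161020140000000892200041962020100104810000000001",
    "222000322131044003161020140000000500000053548020100105211000000001",
    "333786214411044011161020140000001500000208826010100606311000000000",
    "111087652356044052161020140000000815749135335260100402611000000001",
]


def sort_from_column(lines, start=10):
    """Return lines ordered by the text from 1-based character `start` onward."""
    # Lines shorter than `start` yield an empty key, so they sort first.
    return sorted(lines, key=lambda line: line[start - 1:])


if __name__ == "__main__":
    for line in sort_from_column(records):
        print(line)
```

The order produced matches the RESULT block in the first post. To compare only the 9-digit block rather than the rest of the line, the key could be narrowed to `line[9:18]`.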

Re: Sort Records

Posted: 19 Oct 2014 14:01
by foncesa
Hi,

Thanks Yury, a nice one.

Re: Sort Records

Posted: 19 Oct 2014 18:03
by Squashman
You could help yourself by actually reading the help for the SORT command.

C:\>sort /?
SORT [/R] [/+n] [/M kilobytes] [/L locale] [/REC recordbytes]
  [[drive1:][path1]filename1] [/T [drive2:][path2]]
  [/O [drive3:][path3]filename3]
  /+n                         Specifies the character number, n, to
                              begin each comparison.  /+3 indicates that
                              each comparison should begin at the 3rd
                              character in each line.  Lines with fewer
                              than n characters collate before other lines.
                              By default comparisons start at the first
                              character in each line.
  /L[OCALE] locale            Overrides the system default locale with
                              the specified one.  The "C" locale yields
                              the fastest collating sequence and is
                              currently the only alternative.  The sort
                              is always case insensitive.
  /M[EMORY] kilobytes         Specifies amount of main memory to use for
                              the sort, in kilobytes.  The memory size is
                              always constrained to be a minimum of 160
                              kilobytes.  If the memory size is specified
                              the exact amount will be used for the sort,
                              regardless of how much main memory is
                              available.

                              The best performance is usually achieved by
                              not specifying a memory size.  By default the
                              sort will be done with one pass (no temporary
                              file) if it fits in the default maximum
                              memory size, otherwise the sort will be done
                              in two passes (with the partially sorted data
                              being stored in a temporary file) such that
                              the amounts of memory used for both the sort
                              and merge passes are equal.  The default
                              maximum memory size is 90% of available main
                              memory if both the input and output are
                              files, and 45% of main memory otherwise.
  /REC[ORD_MAXIMUM] characters Specifies the maximum number of characters
                              in a record (default 4096, maximum 65535).
  /R[EVERSE]                  Reverses the sort order; that is,
                              sorts Z to A, then 9 to 0.
  [drive1:][path1]filename1   Specifies the file to be sorted.  If not
                              specified, the standard input is sorted.
                              Specifying the input file is faster than
                              redirecting the same file as standard input.
  /T[EMPORARY]
    [drive2:][path2]          Specifies the path of the directory to hold
                              the sort's working storage, in case the data
                              does not fit in main memory.  The default is
                              to use the system temporary directory.
  /O[UTPUT]
    [drive3:][path3]filename3 Specifies the file where the sorted input is
                              to be stored.  If not specified, the data is
                              written to the standard output.   Specifying
                              the output file is faster than redirecting
                              standard output to the same file.


C:\>

Re: Sort Records

Posted: 20 Oct 2014 00:27
by foncesa
Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?
Is it necessary to output to another file? Can't the sort be performed directly on the file itself?
Sorting should start from line 2, as the top line is a header.

Re: Sort Records

Posted: 20 Oct 2014 00:38
by foxidrive
foncesa wrote:Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?
Is it necessary to output to another file? Can't the sort be performed directly on the file itself?
Sorting should start from line 2, as the top line is a header.


Sorting can be done in many ways; some methods are faster than others, and some types of data sort faster than others with a given method.
You can try different sorting programs, but cmd has just the one sort tool.

The sorted data has to be written to a new file no matter which tool you pick, even if the tool deletes the old file and reuses its name, which makes it look like the file was sorted in place.
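
The point above, together with the request to keep the header on line 1, can be sketched in Python (an assumption; the thread itself only uses cmd). The function name `sort_file_keep_header` is hypothetical. The sketch reads the whole file, sorts everything after the first line from character 10 onward, writes the result to a temporary file in the same directory, and then replaces the original with it.

```python
import os
import tempfile


def sort_file_keep_header(path, start=10):
    """Sort all lines after the first by the text from 1-based column `start`."""
    # latin-1 maps every byte to a character, so any single-byte text survives.
    with open(path, "r", encoding="latin-1") as src:
        header = src.readline()
        body = src.readlines()

    # Make sure the last record ends with a newline before it is moved around.
    if body and not body[-1].endswith("\n"):
        body[-1] += "\n"

    body.sort(key=lambda line: line.rstrip("\n")[start - 1:])

    # Write to a new file first, then swap it in under the original name.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="latin-1") as dst:
        dst.write(header)
        dst.writelines(body)
    os.replace(tmp_path, path)
```

A call such as `sort_file_keep_header("example.txt")` would leave the header untouched on line 1. Line endings are normalized to the platform default, and the whole file is held in memory, which is reasonable for a few thousand rows.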

Re: Sort Records

Posted: 20 Oct 2014 01:28
by Yury
foncesa wrote:Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?
Is it necessary to output to another file? Can't the sort be performed directly on the file itself?
Sorting should start from line 2, as the top line is a header.


<"example.txt" more +1| sort /+10 /o "example.txt"

Re: Sort Records

Posted: 20 Oct 2014 01:36
by foxidrive
Yury wrote:

<"example.txt" more +1| sort /+10 /o "example.txt"


Yury, I think the OP wants to preserve the 1st line too.

Re: Sort Records

Posted: 20 Oct 2014 04:11
by Yury
foxidrive wrote:
Yury wrote:

<"example.txt" more +1| sort /+10 /o "example.txt"


Yury, I think the OP wants to preserve the 1st line too.



rem There are 11 spaces before the "!header!".
<"example.txt" (set /p "header="& more +1& cmd /v:on /c echo           !header!)| sort /+10 /o "example.txt"

Re: Sort Records

Posted: 20 Oct 2014 07:46
by Squashman
foncesa wrote:Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?

As Yury has already shown you, using the /O option with the SORT command is faster than redirecting Standard Output to a file.

Sort speed depends on a lot of things, including how big your file is and how fast your computer is. If you want a faster sort, you would probably have to use a third-party sort program that uses a different sorting algorithm. I am not sure what algorithm the SORT command uses, but if it uses a bubble sort, that would be slow on larger files.
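
One way to check how much of the delay comes from the sort itself is to time it on generated data of a similar size. The Python sketch below is an assumption (Python is not used in this thread), the names `make_records` and `time_sort` are hypothetical, and the record width of 66 digits is taken from the sample lines in the first post. It uses Python's built-in sort, not the algorithm of the SORT command, so it only gives a rough reference point.

```python
import random
import time


def make_records(count, width=66, seed=0):
    """Generate `count` pseudo-random digit strings of length `width`."""
    rng = random.Random(seed)
    return ["".join(rng.choice("0123456789") for _ in range(width))
            for _ in range(count)]


def time_sort(lines, start=10):
    """Sort by the text from 1-based column `start`; return (sorted, seconds)."""
    began = time.perf_counter()
    ordered = sorted(lines, key=lambda line: line[start - 1:])
    return ordered, time.perf_counter() - began


if __name__ == "__main__":
    records = make_records(5000)
    ordered, seconds = time_sort(records)
    print(f"sorted {len(ordered)} records in {seconds:.4f} s")
```

If the in-memory sort turns out to be quick on your machine, the time is more likely being spent on reading and writing the file than on the comparison work.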

Re: Sort Records

Posted: 21 Oct 2014 05:40
by foncesa
Hi,

Thanks to all for the suggestions and help provided to a newbie.

Thanks.

Re: Sort Records

Posted: 21 Oct 2014 06:53
by foxidrive
Yury wrote:

rem There are 11 spaces before the "!header!".
<"example.txt" (set /p "header="& more +1& cmd /v:on /c echo           !header!)| sort /+10 /o "example.txt"


This works, but you need to change the output filename as I got a zero byte file.

The header line is changed too, with leading spaces.

Re: Sort Records

Posted: 21 Oct 2014 21:42
by Samir
Squashman wrote:
foncesa wrote:Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?

As Yury has already shown you, using the /O option with the SORT command is faster than redirecting Standard Output to a file.

Sort speed depends on a lot of things, including how big your file is and how fast your computer is. If you want a faster sort, you would probably have to use a third-party sort program that uses a different sorting algorithm. I am not sure what algorithm the SORT command uses, but if it uses a bubble sort, that would be slow on larger files.

And don't forget the read speed of the storage that holds the source file. A CPU can process data many times faster than it can normally be fed, which is why running sorts on RAM drives and SSDs can be faster, but not always.