Sort Records

foncesa
Posts: 42
Joined: 12 Sep 2013 00:09

Sort Records

#1 Post by foncesa » 19 Oct 2014 13:16

Hi,

I want to sort the data in a flat text file; it's not a CSV.
The sort key starts at digit 10 and is a block of 9 digits. Is that possible?
111000100910044019161020140000000892200041962020100104810000000001
222000322131044003161020140000000500000053548020100105211000000001
333786214411044011161020140000001500000208826010100606311000000000
111087652356044052161020140000000815749135335260100402611000000001

RESULT:
222000322131044003161020140000000500000053548020100105211000000001
111087652356044052161020140000000815749135335260100402611000000001
333786214411044011161020140000001500000208826010100606311000000000
111000100910044019161020140000000892200041962020100104810000000001

Yury
Posts: 115
Joined: 28 Dec 2013 07:54

Re: Sort Records

#2 Post by Yury » 19 Oct 2014 13:31

Code:

sort /+10 "example.txt"
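
For anyone who wants to check what /+10 does outside of cmd, here is a minimal Python sketch. It is a swapped-in illustration, not how SORT is implemented, and it assumes plain ASCII digit records like the sample above, so SORT's locale-aware, case-insensitive collation plays no role. The name `sort_from_column` is made up for this example. Note that the comparison runs from the 10th character to the end of the line, not only over a 9-digit block.

```python
def sort_from_column(lines, start=10):
    """Order lines by the text from 1-based column `start` to end of line.

    Roughly mimics `sort /+10` for plain ASCII digit records (assumption:
    no locale or case-folding effects, unlike the real SORT command).
    """
    offset = start - 1  # SORT counts columns from 1, Python slices from 0
    return sorted(lines, key=lambda line: line[offset:])


# The four sample records from the first post.
records = [
    "111000100910044019161020140000000892200041962020100104810000000001",
    "222000322131044003161020140000000500000053548020100105211000000001",
    "333786214411044011161020140000001500000208826010100606311000000000",
    "111087652356044052161020140000000815749135335260100402611000000001",
]

for line in sort_from_column(records):
    print(line)
```

Lines shorter than the start column get an empty key here, so they come first, which matches the behaviour described in the SORT help text.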

foncesa
Posts: 42
Joined: 12 Sep 2013 00:09

Re: Sort Records

#3 Post by foncesa » 19 Oct 2014 14:01

Hi,

Thanks Yury, a nice one.

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: Sort Records

#4 Post by Squashman » 19 Oct 2014 18:03

You could help yourself by actually reading the help for the SORT command.

Code:

C:\>sort /?
SORT [/R] [/+n] [/M kilobytes] [/L locale] [/REC recordbytes]
  [[drive1:][path1]filename1] [/T [drive2:][path2]]
  [/O [drive3:][path3]filename3]
  /+n                         Specifies the character number, n, to
                              begin each comparison.  /+3 indicates that
                              each comparison should begin at the 3rd
                              character in each line.  Lines with fewer
                              than n characters collate before other lines.
                              By default comparisons start at the first
                              character in each line.
  /L[OCALE] locale            Overrides the system default locale with
                              the specified one.  The "C" locale yields
                              the fastest collating sequence and is
                              currently the only alternative.  The sort
                              is always case insensitive.
  /M[EMORY] kilobytes         Specifies amount of main memory to use for
                              the sort, in kilobytes.  The memory size is
                              always constrained to be a minimum of 160
                              kilobytes.  If the memory size is specified
                              the exact amount will be used for the sort,
                              regardless of how much main memory is
                              available.

                              The best performance is usually achieved by
                              not specifying a memory size.  By default the
                              sort will be done with one pass (no temporary
                              file) if it fits in the default maximum
                              memory size, otherwise the sort will be done
                              in two passes (with the partially sorted data
                              being stored in a temporary file) such that
                              the amounts of memory used for both the sort
                              and merge passes are equal.  The default
                              maximum memory size is 90% of available main
                              memory if both the input and output are
                              files, and 45% of main memory otherwise.
  /REC[ORD_MAXIMUM] characters Specifies the maximum number of characters
                              in a record (default 4096, maximum 65535).
  /R[EVERSE]                  Reverses the sort order; that is,
                              sorts Z to A, then 9 to 0.
  [drive1:][path1]filename1   Specifies the file to be sorted.  If not
                              specified, the standard input is sorted.
                              Specifying the input file is faster than
                              redirecting the same file as standard input.
  /T[EMPORARY]
    [drive2:][path2]          Specifies the path of the directory to hold
                              the sort's working storage, in case the data
                              does not fit in main memory.  The default is
                              to use the system temporary directory.
  /O[UTPUT]
    [drive3:][path3]filename3 Specifies the file where the sorted input is
                              to be stored.  If not specified, the data is
                              written to the standard output.   Specifying
                              the output file is faster than redirecting
                              standard output to the same file.


C:\>

foncesa
Posts: 42
Joined: 12 Sep 2013 00:09

Re: Sort Records

#5 Post by foncesa » 20 Oct 2014 00:27

Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?
Is it necessary to output to another file, or can the sort be performed directly on the file itself?
Sorting should start from line 2, as the top line is a header.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Sort Records

#6 Post by foxidrive » 20 Oct 2014 00:38

foncesa wrote:Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?
Is it necessary to output to another file, or can the sort be performed directly on the file itself?
Sorting should start from line 2, as the top line is a header.


Sorting can be done in many ways, and some are faster than others. Some types of data will also sort faster than others with a given sort method.
You can try different sorting programs, but cmd just has the one sort tool.

The sorted file has to be written to a new file no matter which tool you pick, even if it deletes the old file and reuses the same name, which makes it look like the sort happened in the same file.
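
The write-to-a-new-file-then-rename pattern can be sketched as follows. This is only an illustration in Python, not what any particular sort tool does internally. The function name `sort_file_in_place` is hypothetical, and the sketch assumes an ASCII file small enough to read into memory.

```python
import os
import tempfile


def sort_file_in_place(path, start=10):
    """Sort a text file by column `start` onward, then swap it into place."""
    # Assumption: the file is plain ASCII and fits in memory.
    with open(path, "r", encoding="ascii") as src:
        lines = src.read().splitlines()

    lines.sort(key=lambda line: line[start - 1:])

    # Write the result to a temporary file in the same directory...
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="ascii") as dst:
        dst.writelines(line + "\n" for line in lines)

    # ...and only then replace the original under its old name.
    os.replace(tmp_path, path)
```

Calling `sort_file_in_place("example.txt")` would leave a single file under the original name, even though a second file existed for a moment.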

Yury
Posts: 115
Joined: 28 Dec 2013 07:54

Re: Sort Records

#7 Post by Yury » 20 Oct 2014 01:28

foncesa wrote:Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?
Is it necessary to output to another file, or can the sort be performed directly on the file itself?
Sorting should start from line 2, as the top line is a header.


Code:

<"example.txt" more +1| sort /+10 /o "example.txt"

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Sort Records

#8 Post by foxidrive » 20 Oct 2014 01:36

Yury wrote:

Code:

<"example.txt" more +1| sort /+10 /o "example.txt"


Yury, I think the OP wants to preserve the 1st line too.

Yury
Posts: 115
Joined: 28 Dec 2013 07:54

Re: Sort Records

#9 Post by Yury » 20 Oct 2014 04:11

foxidrive wrote:
Yury wrote:

Code:

<"example.txt" more +1| sort /+10 /o "example.txt"


Yury, I think the OP wants to preserve the 1st line too.



Code:

rem There are 11 spaces before the "!header!".
<"example.txt" (set /p "header="& more +1& cmd /v:on /c echo           !header!)| sort /+10 /o "example.txt"

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: Sort Records

#10 Post by Squashman » 20 Oct 2014 07:46

foncesa wrote:Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?

As Yury has already shown you, using the /O option with the SORT command is faster than redirecting Standard Output to a file.

Sort speed depends on a lot of things, including how big your file is and how fast your computer is. If you want a faster sort, you would probably have to use a third-party sort program that uses a different sorting algorithm. I am not sure what algorithm the SORT command uses, but if it uses a Bubble Sort, that would be slower on larger files.
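
One way to check whether the sorting step itself is the bottleneck for a file of this size is to time an in-memory sort on synthetic data. The sketch below uses Python only as a stand-in for a third-party tool. The 5000 rows are randomly generated, not real data, and the timing it prints will vary from machine to machine.

```python
import random
import time

random.seed(0)  # fixed seed so the synthetic data is repeatable

# 5000 made-up records of 66 digits each, the same shape as the sample lines.
rows = ["".join(random.choices("0123456789", k=66)) for _ in range(5000)]

started = time.perf_counter()
# Compare from the 10th character onward, like `sort /+10`.
ordered = sorted(rows, key=lambda line: line[9:])
elapsed = time.perf_counter() - started

print(f"sorted {len(ordered)} rows in {elapsed:.4f} s")
```

If the sort alone finishes quickly, the time is more likely going into reading and writing the file or into the surrounding script than into the algorithm.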

foncesa
Posts: 42
Joined: 12 Sep 2013 00:09

Re: Sort Records

#11 Post by foncesa » 21 Oct 2014 05:40

Hi,

Thanks to all for suggestion and help provided to newbie.

Thanks.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Sort Records

#12 Post by foxidrive » 21 Oct 2014 06:53

Yury wrote:

Code:

rem There are 11 spaces before the "!header!".
<"example.txt" (set /p "header="& more +1& cmd /v:on /c echo           !header!)| sort /+10 /o "example.txt"


This works, but you need to change the output filename, as I got a zero-byte file.

The header line is changed too, with leading spaces.
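
A way to keep the header untouched is to set it aside, sort only the remaining lines, and put the header back on top, so no padding is needed. Below is a minimal Python sketch of that idea, not a cmd solution. The name `sort_keep_header` and the header text are hypothetical, and the three data lines are taken from the first post.

```python
def sort_keep_header(lines, start=10):
    """Return lines with the first one kept on top and the rest sorted
    by the text from 1-based column `start` onward."""
    if not lines:
        return []
    header, body = lines[0], lines[1:]
    return [header] + sorted(body, key=lambda line: line[start - 1:])


sample = [
    "HEADER (hypothetical, not from the original file)",
    "111000100910044019161020140000000892200041962020100104810000000001",
    "222000322131044003161020140000000500000053548020100105211000000001",
    "333786214411044011161020140000001500000208826010100606311000000000",
]

for line in sort_keep_header(sample):
    print(line)
```

The header never enters the sort, so it is written back exactly as it was read.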

Samir
Posts: 384
Joined: 16 Jul 2013 12:00
Location: HSV

Re: Sort Records

#13 Post by Samir » 21 Oct 2014 21:42

Squashman wrote:
foncesa wrote:Thanks Squashman,
I still face a difficulty: the file is approx. 4-5 thousand rows and it takes a lot of time to complete. Is there a faster way to do that?

As Yury has already shown you, using the /O option with the SORT command is faster than redirecting Standard Output to a file.

Sort speed depends on a lot of things, including how big your file is and how fast your computer is. If you want a faster sort, you would probably have to use a third-party sort program that uses a different sorting algorithm. I am not sure what algorithm the SORT command uses, but if it uses a Bubble Sort, that would be slower on larger files.

And don't forget the read speed of the storage the source file is on. A CPU can process the data many times faster than it can normally be fed. That is why running sorts on RAM drives and SSDs can be faster, but not always.
