Discussion forum for all Windows batch related topics.
Moderator: DosItHelp
-
foncesa
- Posts: 42
- Joined: 12 Sep 2013 00:09
#1
by foncesa » 19 Oct 2014 13:16
Hi,
I want to sort the data in a flat text file; it's not a CSV.
The sort key starts at character 10 and is a block of 9 digits. Is that possible?
111000100910044019161020140000000892200041962020100104810000000001
222000322131044003161020140000000500000053548020100105211000000001
333786214411044011161020140000001500000208826010100606311000000000
111087652356044052161020140000000815749135335260100402611000000001
RESULT:
222000322131044003161020140000000500000053548020100105211000000001
111087652356044052161020140000000815749135335260100402611000000001
333786214411044011161020140000001500000208826010100606311000000000
111000100910044019161020140000000892200041962020100104810000000001
-
foncesa
- Posts: 42
- Joined: 12 Sep 2013 00:09
#3
by foncesa » 19 Oct 2014 14:01
Hi,
Thanks Yury, a nice one.
-
Squashman
- Expert
- Posts: 4486
- Joined: 23 Dec 2011 13:59
#4
by Squashman » 19 Oct 2014 18:03
You could help yourself by actually reading the help for the SORT command.
Code:
C:\>sort /?
SORT [/R] [/+n] [/M kilobytes] [/L locale] [/REC recordbytes]
  [[drive1:][path1]filename1] [/T [drive2:][path2]]
  [/O [drive3:][path3]filename3]
  /+n                         Specifies the character number, n, to
                              begin each comparison. /+3 indicates that
                              each comparison should begin at the 3rd
                              character in each line. Lines with fewer
                              than n characters collate before other lines.
                              By default comparisons start at the first
                              character in each line.
  /L[OCALE] locale            Overrides the system default locale with
                              the specified one. The "C" locale yields
                              the fastest collating sequence and is
                              currently the only alternative. The sort
                              is always case insensitive.
  /M[EMORY] kilobytes         Specifies amount of main memory to use for
                              the sort, in kilobytes. The memory size is
                              always constrained to be a minimum of 160
                              kilobytes. If the memory size is specified
                              the exact amount will be used for the sort,
                              regardless of how much main memory is
                              available.
                              The best performance is usually achieved by
                              not specifying a memory size. By default the
                              sort will be done with one pass (no temporary
                              file) if it fits in the default maximum
                              memory size, otherwise the sort will be done
                              in two passes (with the partially sorted data
                              being stored in a temporary file) such that
                              the amounts of memory used for both the sort
                              and merge passes are equal. The default
                              maximum memory size is 90% of available main
                              memory if both the input and output are
                              files, and 45% of main memory otherwise.
  /REC[ORD_MAXIMUM] characters Specifies the maximum number of characters
                              in a record (default 4096, maximum 65535).
  /R[EVERSE]                  Reverses the sort order; that is,
                              sorts Z to A, then 9 to 0.
  [drive1:][path1]filename1   Specifies the file to be sorted. If not
                              specified, the standard input is sorted.
                              Specifying the input file is faster than
                              redirecting the same file as standard input.
  /T[EMPORARY]
    [drive2:][path2]          Specifies the path of the directory to hold
                              the sort's working storage, in case the data
                              does not fit in main memory. The default is
                              to use the system temporary directory.
  /O[UTPUT]
    [drive3:][path3]filename3 Specifies the file where the sorted input is
                              to be stored. If not specified, the data is
                              written to the standard output. Specifying
                              the output file is faster than redirecting
                              standard output to the same file.
C:\>
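Applied to the sample data in the first post, the /+n switch might be used as in the minimal sketch below. The file names example.txt and sorted.txt are assumptions for illustration. Note that /+10 starts each comparison at character 10 and continues to the end of the line rather than stopping after 9 characters; the extra characters only matter as a tie-breaker when two lines share the same 9-digit block.

```batch
rem Sort the lines of example.txt, comparing from character 10 onward,
rem and write the result to sorted.txt (both file names are hypothetical).
sort /+10 "example.txt" /o "sorted.txt"
```

Because SORT compares text rather than numbers, this relies on the key being fixed-width digits, as it is in the sample.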
-
foncesa
- Posts: 42
- Joined: 12 Sep 2013 00:09
#5
by foncesa » 20 Oct 2014 00:27
Thanks Squashman,
I still have a difficulty: the file is approx. 4 to 5 thousand rows and the sort takes a long time to complete. Is there a faster way to do it?
Is it necessary to output to another file? Can't the sort be performed directly on the file itself?
Sorting should start from line 2, as the top line is a header.
-
foxidrive
- Expert
- Posts: 6031
- Joined: 10 Feb 2012 02:20
#6
by foxidrive » 20 Oct 2014 00:38
foncesa wrote:
Thanks Squashman,
I still have a difficulty: the file is approx. 4 to 5 thousand rows and the sort takes a long time to complete. Is there a faster way to do it?
Is it necessary to output to another file? Can't the sort be performed directly on the file itself?
Sorting should start from line 2, as the top line is a header.
Sorting can be done in many ways, some faster than others, and some types of data will sort faster than others with a given sort method.
You can try different sorting programs, but cmd has just the one sort tool.
The sorted data has to be written to a new file no matter which tool you pick, even if the tool deletes the old file and reuses the same name, which makes it look like it's sorting in place.
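A minimal sketch of that idea, assuming the hypothetical file names example.txt and example.tmp: sort into a temporary file first, then move it over the original only if SORT reported success.

```batch
rem Sort into a temporary file; replace the original only if SORT succeeded.
rem Both file names are assumptions for illustration.
sort /+10 "example.txt" /o "example.tmp" && move /y "example.tmp" "example.txt" >nul
```

Keeping the original untouched until the sorted copy is complete means a failed sort does not cost you the input data.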
-
Yury
- Posts: 115
- Joined: 28 Dec 2013 07:54
#7
by Yury » 20 Oct 2014 01:28
foncesa wrote:
Thanks Squashman,
I still have a difficulty: the file is approx. 4 to 5 thousand rows and the sort takes a long time to complete. Is there a faster way to do it?
Is it necessary to output to another file? Can't the sort be performed directly on the file itself?
Sorting should start from line 2, as the top line is a header.
Code:
<"example.txt" more +1| sort /+10 /o "example.txt"
-
foxidrive
- Expert
- Posts: 6031
- Joined: 10 Feb 2012 02:20
#8
by foxidrive » 20 Oct 2014 01:36
Yury wrote:
Code:
<"example.txt" more +1| sort /+10 /o "example.txt"
Yury, I think the OP wants to preserve the 1st line too.
-
Yury
- Posts: 115
- Joined: 28 Dec 2013 07:54
#9
by Yury » 20 Oct 2014 04:11
foxidrive wrote:
Yury wrote:
Code:
<"example.txt" more +1| sort /+10 /o "example.txt"
Yury, I think the OP wants to preserve the 1st line too.
Code:
rem There are 11 spaces before the "!header!".
<"example.txt" (set /p "header="& more +1& cmd /v:on /c echo           !header!)| sort /+10 /o "example.txt"
-
Squashman
- Expert
- Posts: 4486
- Joined: 23 Dec 2011 13:59
#10
by Squashman » 20 Oct 2014 07:46
foncesa wrote:
Thanks Squashman,
I still have a difficulty: the file is approx. 4 to 5 thousand rows and the sort takes a long time to complete. Is there a faster way to do it?
As Yury has already shown you, using the /O option with the SORT command is faster than redirecting standard output to a file.
Sort speed depends on a lot of things, including how big your file is and how fast your computer is. If you want a faster sort, you would probably have to use a third-party sort program that uses a different sorting algorithm. I am not sure what algorithm the SORT command uses, but if it uses a bubble sort, that would be slow on larger files.
-
foncesa
- Posts: 42
- Joined: 12 Sep 2013 00:09
#11
by foncesa » 21 Oct 2014 05:40
Hi,
Thanks to all for the suggestions and help provided to a newbie.
Thanks.
-
foxidrive
- Expert
- Posts: 6031
- Joined: 10 Feb 2012 02:20
#12
by foxidrive » 21 Oct 2014 06:53
Yury wrote:
Code:
rem There are 11 spaces before the "!header!".
<"example.txt" (set /p "header="& more +1& cmd /v:on /c echo           !header!)| sort /+10 /o "example.txt"
This works, but you need to change the output filename, as I got a zero-byte file when it was the same as the input.
The header line is changed too: it gains leading spaces.
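One way to address both points, sketched under assumptions: write the header unchanged to a separate output file, then append the sorted remainder. The names example.txt and sorted.txt are hypothetical. The sketch also assumes a short header line, since set /p reads only a limited number of characters (about 1 KB), and it assumes the data contains no tabs, because MORE expands them to spaces.

```batch
@echo off
setlocal enabledelayedexpansion
rem Hypothetical file names; adjust to suit.
set "in=example.txt"
set "out=sorted.txt"

rem Read the first line of the input (the header) into a variable.
set "header="
<"%in%" set /p "header="

rem Write the header unchanged as line 1 of the output file.
>"%out%" echo(!header!

rem Sort everything after line 1, comparing from character 10, and append it.
more +1 "%in%" | sort /+10 >>"%out%"
endlocal
```

Here the header never passes through SORT, so it needs no padding to force it to the top.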
-
Samir
- Posts: 384
- Joined: 16 Jul 2013 12:00
- Location: HSV
#13
by Samir » 21 Oct 2014 21:42
Squashman wrote:
foncesa wrote:
Thanks Squashman,
I still have a difficulty: the file is approx. 4 to 5 thousand rows and the sort takes a long time to complete. Is there a faster way to do it?
As Yury has already shown you, using the /O option with the SORT command is faster than redirecting standard output to a file.
Sort speed depends on a lot of things, including how big your file is and how fast your computer is. If you want a faster sort, you would probably have to use a third-party sort program that uses a different sorting algorithm. I am not sure what algorithm the SORT command uses, but if it uses a bubble sort, that would be slow on larger files.
And don't forget the read speed of the storage that holds the source file. A CPU can process data many times faster than it can normally be fed, which is why running sorts on RAM drives and SSDs can be faster, but not always.