Wednesday, November 15, 2006

 

Finding a keyword in an encoded file.

While testing a codec, I needed to find out how many milliseconds of audio are in an encoded file. I decode this data into WAV format.

Here is the start of the encoded file:
c:\usr\shankar\lib>cat packed.pcm |od -h |head
000000 6b21 0fc5 0457 6f00 065d 58c8 6b21 0dc5
000010 0c3b de36 a172 8a89 6b21 0dc5 cc97 ef2c
000020 8381 bc56 6b21 0b63 849b 300a 4b8c 927b
000030 6b21 0f65 5c3b bb40 c56d 7567 6b21 0dc5
000040 943b a645 8869 94a9 6b21 0d45 143b 7314
000050 8469 f256 6b21 7f75 1c3b 1fa6 e571 22c7
000060 6b21 cb63 5c3a 5708 796c 2a94 6b21 6f75
000070 043b 1d27 1171 2201 6b21 0d45 e43b 9735
000080 7c6f 9e03 6b21 7f75 c43b 1f32 3875 a76c
000090 6b21 0dc5 f437 e5ad f36c b14f 6b21 cb63
Each ten milliseconds' worth of encoded data is delimited by 6b21. So if I count the number of 6b21's in the encoded data and multiply by ten, I get the total number of milliseconds.

Here is one way to do it.
c:\usr\shankar\lib>cat packed.pcm |od -h |cut -d' ' -f2-| tr '\ ' '\n' |grep 6b21 |wc
300 2700 14100
The first number printed by wc is the line count, so there are 3 seconds (300 * 10 milliseconds) of audio in this file.
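
The count and the multiplication can be wrapped in a small shell function. This is only a sketch: the names count_frames and audio_ms are made up for illustration, it assumes an od that accepts the POSIX options -An, -v and -tx2 (on GNU od, -tx2 prints the same 2-byte hex words as -h), and it assumes, as above, that every 6b21 word marks one 10 ms frame.

```shell
# Count the 16-bit words equal to MARKER (default 6b21) in FILE.
# -An drops the address column, -tx2 prints 2-byte hex words,
# -v stops od from collapsing repeated lines into "*".
# Note: a payload word that happens to equal the marker is counted too.
count_frames() {
    od -An -v -tx2 "$1" | tr -s ' ' '\n' | grep -c "^${2:-6b21}\$"
}

# Convert the frame count to milliseconds, assuming 10 ms per frame.
audio_ms() {
    echo $(( $(count_frames "$1" "$2") * 10 ))
}
```

With these defined, audio_ms packed.pcm prints the duration in milliseconds.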

Explanation:
The command cut -d' ' -f2- removes the first column, which is the address field, and keeps everything after it.
The tr command translates each space to a linefeed, as shown below.
c:\usr\shankar\lib>cat packed.pcm |od  -h |cut -d' ' -f2- |tr '\ ' '\n'|head
6b21
0fc5
0457
6f00
065d
58c8
6b21
0dc5
0c3b
de36
Why this format?
Because with one value per line, I only have to extract the 6b21's using grep, as shown below.
c:\usr\shankar\lib>cat packed.pcm |od  -h |cut -d' ' -f2- |tr '\ ' '\n'|grep 6b21 |head
6b21
6b21
6b21
6b21
6b21
6b21
6b21
6b21
6b21
6b21
Now wc counts only the 6b21's. All of the Unix commands above are available on Windows by installing Cygwin or DJGPP.
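
If your grep supports the -o and -w options (GNU and BSD grep do, but they are not in POSIX), the cut and tr steps can be skipped, because grep -o prints each match on its own line. A hedged sketch, with count_marker as a made-up name and the same assumptions about od's -An, -v and -tx2 options:

```shell
# Print how many whole words in the hex dump of FILE equal MARKER
# (default 6b21). grep -o emits one line per match, -w matches whole
# words only, and wc -l counts those lines.
count_marker() {
    od -An -v -tx2 "$1" | grep -ow "${2:-6b21}" | wc -l
}
```

Multiplying the result of count_marker packed.pcm by ten gives the length in milliseconds.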
