This is quite simple. A look at the Song.ini file makes it immediately obvious where the timing information and the text are stored: the Sync lines are the only ones with long lists of numbers, and the Text lines hold the lyrics.
Sync0=412,469,524,1009,1041,1069,1194,1244,1297,1413,1466,1498,1544,1935,1957,1984,2152,2185,2213,2356,2380,2409,2499,2839,2867,2902,3089,3142,3299,3332,3468,3530,3584,3785,3815,3942,3981,4012,4198,4242,4673
Sync1=4698,4737,4899,4928,4966,5051,5086,5126,5186,5585,5613,5645,5813,5843,5879,5967,6036,6072,6154,6503,6530,6566,6736,6798,6989,7024,7123,7216,7264,7412,7473,7612,7641,7674,7722,7845,7935,8360,8396,8527,8555
Text0=CHÚA BIẾT RÕ
Text1=
Text2=Ngài biết rõ
Text3=những nhu cầu
Text4=của đời sống tôi
As we see, the timing is stored separately from the text, and we need to find a way to merge them. Let’s count how many Sync numbers there are in total:
> grep -a ^Sync Song.ini | sed -e 's/,/\n/g' | wc -l
217
So we have 217 timing marks and only 79 text lines, which means more than one timing mark applies to each text line. That is reasonable. Let’s assume each timing mark applies to a word. To check this, we need to count the words in the Text fields:
> grep -ae '^Text[0-9]*=' Song.ini | sed -e 's/[ \/]/\n/g' | wc -l
223
Close, but not exact. The numbers do not match, so presumably some text fields do not use the timings. Looking at the dump above, we see the empty “Text1=” line. Does it make sense to have a timing mark for an empty line? Not really. Let’s remove such lines from the calculation:
> grep -aE '^Text[0-9]*=\w' Song.ini | sed -e 's/ /\n/g' | wc -l
218
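To repeat this comparison on other files, the two counts can be wrapped in small shell functions. This is only a sketch: the names `count_syncs` and `count_words` are made up for this example, and it assumes that words are separated by spaces or slashes and that empty Text lines carry no timing marks.

```shell
# Hypothetical helpers; the names are not part of any KFN tool.

# Count the comma-separated values on all SyncN= lines of a Song.ini file.
count_syncs() {
    grep -a '^Sync[0-9]*=' "$1" |
        sed -e 's/^[^=]*=//' |
        tr -d '\r' |
        tr ',' '\n' |
        awk 'length($0) > 0 { n++ } END { print n + 0 }'
}

# Count the words on all TextN= lines, splitting on spaces and slashes.
# Empty Text lines produce no tokens, so they are skipped automatically.
count_words() {
    grep -a '^Text[0-9]*=' "$1" |
        sed -e 's/^[^=]*=//' |
        tr -d '\r' |
        tr ' /' '\n\n' |
        awk 'length($0) > 0 { n++ } END { print n + 0 }'
}
```

Running `count_syncs Song.ini` and `count_words Song.ini` should print two numbers that are equal, or nearly so, if the one-mark-per-word assumption holds for that file.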
Almost there: 218 words against 217 timing marks. Let’s convert the Song.ini to an LRC file and play it to check whether it is valid. One remaining issue is how the time is encoded. This is quite easy to guess: the largest time value is 26736, so the unit is clearly tens of milliseconds (i.e. divide by 100 to get seconds), which gives 267.36 seconds, or about 4:27. Any other divisor yields an unreasonable track length.
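The unit guess can be checked with a little arithmetic. Here is a minimal shell sketch, assuming the values really are hundredths of a second; `to_lrc_time` is a name invented for this example.

```shell
# Convert a Sync value (assumed to be in 1/100 s) to an LRC-style mm:ss.xx tag.
to_lrc_time() {
    printf '%02d:%02d.%02d\n' \
        $(( $1 / 6000 )) \
        $(( $1 % 6000 / 100 )) \
        $(( $1 % 100 ))
}
```

`to_lrc_time 26736` prints `04:27.36`, a plausible song length. Reading the same value as milliseconds would give under 27 seconds, and reading it as seconds would give over seven hours.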
Here’s the converter script written in Perl:
#!/usr/bin/perl
use warnings;
use strict;

die "Usage: $0 <file>\n" if !defined $ARGV[0];

open my $fh, "<", $ARGV[0] or die "Couldn't open $ARGV[0]: $!\n";
binmode $fh, ":utf8";
my @content = <$fh>;
close $fh;

# Get the sync info
my (@syncs, @text);

foreach my $line ( @content )
{
    # Strip CR/LF line endings
    $line =~ s/[\r\n]+$//;

    # Add the sync markers into the sync array
    push @syncs, split( /,/, $1 ) if $line =~ /^Sync\d+=(.*)$/;

    # Skip empty Text lines; split the rest into words
    if ( $line =~ /^Text\d+=(\w+.*)$/ )
    {
        push @text, split( /[ \/]/, $1 );
        push @text, "";   # end of line
    }
}

# Print a fake LRC header
binmode STDOUT, ":utf8";
print "[ti: test]\n[ar: test]\n";

my $last_time = "00:00.00";

foreach my $word ( @text )
{
    if ( $word eq "" )
    {
        print "[$last_time]\n";
        next;
    }

    # Stop if there are more words than timing marks
    my $time = shift @syncs;
    last if !defined $time;

    # Convert the time (in 1/100 s) to zero-padded mm:ss.xx
    $last_time = sprintf "%02d:%02d.%02d",
        int( $time / 6000 ), int( $time / 100 ) % 60, $time % 100;
    print "[$last_time]$word ";
}
We test it, and voila – everything works fine. We have reverse-engineered the format, and we can integrate it into the player!
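A listening test can be backed up with a quick mechanical check that the generated timestamps never go backwards. The following is a sketch under assumptions: `lrc_check_order` is a hypothetical name, the tags are expected in the `[mm:ss.xx]` form, and the part after the dot is treated as a count of hundredths of a second.

```shell
# Hypothetical check: return non-zero if any [mm:ss.xx] tag in an LRC file
# is earlier than the tag before it. Header tags such as [ti: test] do not
# match the numeric pattern and are ignored.
lrc_check_order() {
    grep -o '\[[0-9][0-9]*:[0-9][0-9]*\.[0-9][0-9]*]' "$1" |
        tr -d '[]' |
        awk -F'[:.]' '
            { t = $1 * 6000 + $2 * 100 + $3 }
            NR > 1 && t < prev { print "out of order: " $0; bad = 1 }
            { prev = t }
            END { exit bad + 0 }
        '
}
```

Called as `lrc_check_order song.lrc` (the file name is just an example), it prints every tag that is earlier than its predecessor and returns a non-zero status if it found any.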
Some files however are encrypted. How to deal with encryption? See part 4!
Hi, I found something strange…
I have a KFN file with a 3:46 track. I exported a song.ini file from it.
It has 72 lines of text (no empty lines). Counting all the separators (spaces and slashes), there are about 430 of them.
But the Sync lines contain more than 500 timing marks. Moreover, some timing marks go beyond the track length (4:10 and more).
Have you run into this?