Reverse-engineering the KaraFun file format. Part 3, the Song.ini file

This part is quite simple. Looking at the Song.ini file, it is immediately obvious where the text and the timing information are stored, as those are the only lines with enough numbers:

Sync0=412,469,524,1009,1041,1069,1194,1244,1297,1413,1466,1498,1544,1935,1957,1984,2152,2185,2213,2356,2380,2409,2499,2839,2867,2902,3089,3142,3299,3332,3468,3530,3584,3785,3815,3942,3981,4012,4198,4242,4673
Sync1=4698,4737,4899,4928,4966,5051,5086,5126,5186,5585,5613,5645,5813,5843,5879,5967,6036,6072,6154,6503,6530,6566,6736,6798,6989,7024,7123,7216,7264,7412,7473,7612,7641,7674,7722,7845,7935,8360,8396,8527,8555
Text0=CHÚA BIẾT RÕ
Text1=
Text2=Ngài biết rõ
Text3=những nhu cầu
Text4=của đời sống tôi

As we can see, the timing is stored separately from the text, so we need to figure out how to merge the two. Let’s count how many Sync numbers there are in total:

grep -a ^Sync Song.ini | sed -e 's/,/\n/g' | wc -l
217

So we have 217 timing marks and only 79 text lines. More than one timing mark must therefore apply to each text line, which is reasonable. Let’s assume each timing mark applies to a word. To check this, we need to count the total number of words in the Text fields:

grep -ae '^Text[0-9]*=' Song.ini | sed -e 's/[ \/]/\n/g' | wc -l
223

Close, but not exact. The numbers do not match, so some text fields probably do not use the timings. Looking at the dump above we see the empty “Text1=” line. Does it make sense to have a timing mark for an empty line? Not really. Let’s exclude such lines from the count:

grep -aE '^Text[0-9]*=\w' Song.ini | sed -e 's/ /\n/g'| wc -l
218
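
The two counts can also be taken in a single pass, which makes it easier to repeat the comparison on other songs. The following is only a sketch under the same assumptions as the commands above: words are separated by spaces or slashes, and empty Text lines contribute nothing. The function name count_marks_and_words is made up for this example, and the word count it prints is not guaranteed to match the figures quoted above, since the splitting rules differ slightly.

```shell
# count_marks_and_words FILE
# Prints "<number of sync marks> <number of words>" for a Song.ini-style file.
# Assumption: words are separated by spaces or slashes.
count_marks_and_words() {
    awk '
        { sub(/\r$/, "") }                  # tolerate CRLF line endings
        /^Sync[0-9]+=/ {
            sub(/^Sync[0-9]+=/, "")         # keep only the comma-separated list
            syncs += split($0, parts, ",")
        }
        /^Text[0-9]+=/ {
            sub(/^Text[0-9]+=/, "")         # keep only the lyric text
            gsub(/\//, " ")                 # treat a slash like a space
            words += split($0, parts)       # empty lines add 0 words
        }
        END { print syncs + 0, words + 0 }
    ' "$1"
}
```

It would be called as count_marks_and_words Song.ini; if the two printed numbers are equal, the one-mark-per-word assumption holds for that file.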

Almost there. Let’s convert Song.ini to an LRC file and play it to check whether the result is valid. The one remaining issue is to guess how the time is encoded. This is quite easy, however: the largest time value is 26736, so the unit is clearly hundredths of a second, i.e. 10 ms (divide by 100 to get seconds, which gives 267.36 s, or about four and a half minutes). Any other divisor gives a very unreasonable song length, so it is easy to guess.
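
As a quick check of that guess, here is a small sketch that turns a raw sync value into an LRC-style mm:ss.xx timestamp, assuming the unit really is 1/100 of a second. The helper name to_lrc_time is made up for this example. Each field is zero-padded, because a fraction written as ".9" would not read as ".09".

```shell
# to_lrc_time VALUE
# Converts a sync value (assumed to be in 1/100 s) to mm:ss.xx.
to_lrc_time() {
    t=$1
    printf '%02d:%02d.%02d\n' $((t / 6000)) $((t % 6000 / 100)) $((t % 100))
}

to_lrc_time 26736   # prints 04:27.36
to_lrc_time 1009    # prints 00:10.09
```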

Here’s the converter script written in Perl:

#!/usr/bin/perl

use warnings;
use strict;

die "Usage: $0 <file>\n" if !defined $ARGV[0];

# Read the whole file, decoding it as UTF-8
open my $fh, "<:encoding(UTF-8)", $ARGV[0] or die "Couldn't open $ARGV[0]: $!\n";
my @content = <$fh>;
close $fh;

# Get the sync info
my (@syncs, @text);

foreach my $line ( @content )
{
	# CRLFs
	$line =~ s/\r/\n/;
	$line =~ s/\n+/\n/;

	# Add the sync markers into the sync array
	push @syncs, split( /,/, $1 ) if $line =~ /^Sync\d+=(.*)$/;

	if ( $line =~ /^Text\d+=(\w+.*)$/ )
	{
		push @text, split( /[ \/]/, $1 );
		push @text, ""; # end of line
	}
}

# Print a fake LRC header
binmode STDOUT, ":utf8";
print "[ti: test]\n[ar: test]\n";
my $last_time;

foreach my $word ( @text )
{
	if ( $word eq "" )
	{
		print "[$last_time]\n";
		next;
	}

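	# Convert the time (in 1/100 s) to the LRC mm:ss.xx format
	my $time = shift @syncs;
	last if !defined $time; # ran out of timing marks
	my $min = int( $time / 6000 );
	my $sec = int( ( $time % 6000 ) / 100 );
	my $csec = $time % 100;
	# Zero-pad every field, otherwise 1009 would become 0:10.9 instead of 00:10.09
	$last_time = sprintf( "%02d:%02d.%02d", $min, $sec, $csec );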
	print "[$last_time]$word ";
}

We test it, and voila – everything works fine. We have reverse-engineered the format, and we can integrate it into the player!

This entry was posted in android, reverse engineering.
