Reorder Text With Perl Split and Splice
We were hired to edit some transcriptions. The transcriptions were in different formats. Sometimes the “time” was first, and sometimes the “who” was first. Here is one way to reorder text programmatically.
Original Text to Reorder
The client gave us text in the format “who”, “when”, “what was said”:
Female: '00.0s' "Hello!"
Male: '01.0s' "Hello!"
Female: '02.0s' "Do you like my hat?"
Male: '03.5s' "I do not."
Female: '04.9s' "Good-by!"
Male: '06.0s' "Good-by!"
Desired Output
The client wanted text in the format “when”, “who”, “what was said”:
'00.0s',Female:,"Hello!"
'01.0s',Male:,"Hello!"
'02.0s',Female:,"Do you like my hat?"
'03.5s',Male:,"I do not."
'04.9s',Female:,"Good-by!"
'06.0s',Male:,"Good-by!"
Reorder Text in Arrays
The real job required a bit more than we show here (commas needed to be escaped), but overall a simple Perl split and splice was all it took:
#!/usr/bin/perl
use warnings;
use strict;

# loop over data handle
while (<DATA>) {

    # for readability, make new variable
    my $line = $_;

    # split the data on single spaces, as that is our delimiter
    my @line = split(/ /, $line);

    # the very first part is "who"
    my $who = $line[0];

    # the next chunk of our data is "when"
    my $time = $line[1];

    # put the rest of the data into a new array using splice
    # splice takes the array and an offset; with no length given,
    # it removes and returns everything from the offset to the end
    my @restof = splice(@line, 2);

    # print it out; interpolating the array in double quotes joins its elements with spaces
    # no newline is added here because the last element still ends with the original one
    print "$time,$who,@restof";
}

# paste into data handle
__DATA__
Female: '00.0s' "Hello!"
Male: '01.0s' "Hello!"
Female: '02.0s' "Do you like my hat?"
Male: '03.5s' "I do not."
Female: '04.9s' "Good-by!"
Male: '06.0s' "Good-by!"
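Since some transcriptions arrived with the time first and others with the speaker first, the same split and splice idea can be stretched to cope with either order. The following is only a sketch, not the script we actually delivered: `reorder_line` is a hypothetical helper name, and the timestamp pattern is an assumption based on the sample data (a quoted number such as '00.0s'). It also assumes every line has at least three space-separated fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns one line as
# "when,who,what was said", whichever of the first two fields came first.
sub reorder_line {
    my ($line) = @_;
    chomp $line;

    # Split on single spaces, as in the script above.
    my @fields = split(/ /, $line);

    # splice removes the first two fields from @fields and returns them;
    # what remains in @fields is the "what was said" text.
    my ($first, $second) = splice(@fields, 0, 2);

    # Assumption: a timestamp looks like '00.0s' (quote, digits, dot, digits, s, quote).
    my ($time, $who) = $first =~ /^'\d+\.\d+s'$/
        ? ($first, $second)
        : ($second, $first);

    # Interpolating @fields in double quotes rejoins the words with spaces.
    return join(',', $time, $who, "@fields");
}

# Try it on one line in each order.
for my $sample (
    q{Female: '00.0s' "Hello!"},
    q{'02.0s' Female: "Do you like my hat?"},
) {
    my $out = reorder_line($sample);
    print "$out\n";
}
```

Both sample lines come out with the time first, so a file that mixes the two orders could be normalized in a single pass. Lines that do not match the assumed timestamp pattern in their first field are treated as speaker-first, so real data would need its own check.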