
Smalltalk fileout conversion to simple text (fileout2txt)

Víctor (Bit-Man) Rodríguez
Author
Algorithm Junkie, Data Structures lover, Open Source enthusiast


While working on a piece of Smalltalk code (the Pharo flavor), I needed to include it in a report. I tried a code fileout, but once it was inserted into the report, all sorts of marks and control characters produced an ugly result. The code was several kilobytes long, so reformatting it by hand was not an option, and inspecting the file gave me some clues. The most annoying part was that a new page was inserted between classes; looking at the file with hexdump showed that a single 0x0c character (form feed) was the cause of this behaviour:

# hd Fileout.st  
...
00004f10 6d 65 6e 74 53 74 61 6d 70 3a 20 27 3c 68 69 73 |mentStamp: '<his|
00004f20 74 6f 72 69 63 61 6c 3e 27 20 70 72 69 6f 72 3a |torical>' prior:|
00004f30 20 30 21 0d 21 0d 0d 0c 0d 46 6f 74 6f 59 6f 70 | 0!.!....FotoYop|
00004f40 45 78 63 65 70 74 69 6f 6e 20 73 75 62 63 6c 61 |Exception subcla|
...

To get rid of it, a simple cat Fileout.st | tr -d '\014' was enough (014 is the octal value of 0x0c, and the quotes keep the shell from eating the backslash). The fileout also contains exclamation marks, which are annoying too; they were discarded using sed:

sed -i 's/!//gm' Fileout.st
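Both cleanups so far can be tried on a small sample before touching the real file. The sketch below uses a made-up two-chunk snippet (the class names are hypothetical) and a helper function, strip_marks, that is not part of the original commands. Without the quotes around '\014', the shell would drop the backslash and tr would delete the digits 0, 1 and 4 instead of the form feed.

```shell
# strip_marks: delete form feeds (octal 014, hex 0x0c) and exclamation marks.
# Hypothetical helper, shown only to illustrate the two commands above.
strip_marks() {
    tr -d '\014' | sed 's/!//g'
}

# Made-up sample: two chunks separated by a line holding only a form feed.
marks_out=$(printf 'ClassA prior: 0!\n\014\nClassB subclass!\n' | strip_marks)
printf '%s\n' "$marks_out"
```

The form feed line is left behind as an empty line, which is harmless in a report.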

Now it is the comments' turn. These are internal comments and timestamps, and the goal is to wipe them out. grep and sed came to the rescue:

cat Fileout.st | grep -v commentStamp: | sed 's/stamp: .*//'

Finally, some dashed lines that seem to be used as separators are also deleted:

grep -v -e "-- --" Fileout.st
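The comment and separator filters can be checked the same way. This is a minimal sketch on a made-up sample (exclamation marks already removed); the helper name strip_comments and the sample lines are assumptions for illustration, not taken from the real fileout. The dashed pattern is passed with -e so that grep does not read the leading dash as an option.

```shell
# strip_comments: drop commentStamp lines, cut method timestamps,
# then drop the dashed separator lines. Hypothetical helper name.
strip_comments() {
    grep -v 'commentStamp:' | sed 's/stamp: .*//' | grep -v -e '-- --'
}

# Made-up sample, with the exclamation marks already removed.
sample="Foo commentStamp: '<historical>' prior: 0
\"-- -- -- -- -- -- -- -- \"
Foo methodsFor: 'accessing' stamp: 'vr 1/1/2011 10:00'
name
    ^ name"

comments_out=$(printf '%s\n' "$sample" | strip_comments)
printf '%s\n' "$comments_out"
```

Only the methodsFor: header (minus its timestamp) and the method body survive.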

Those are the basic bits; here is a fully functional script:

#!/bin/bash
## ToDo: perform the conversion from inside Smalltalk?

[[ -z "$1" ]] && echo "File name missing" >&2 && exit 1

__fileIn="$1"
__tmp="/tmp/$$"
__fileWork="${__tmp}/$(basename "$1")"
__fileOut="${__fileIn}.txt"

mkdir -p "${__tmp}"

cp "${__fileIn}" "${__fileWork}"

## Mac to Unix conversion (CR line endings to LF)
mac2unix "${__fileWork}"

## Exclamation marks out
sed -i 's/!//g' "${__fileWork}"

## New pages (form feeds) and comments wiped out
tr -d '\014' < "${__fileWork}" | grep -v commentStamp: | sed 's/stamp: .*//' > "${__tmp}/tmp1"

## Dashed lines too; leave the final result in the same folder as the original,
## with the same name plus a .txt ending (old DOS name inheritance)
grep -v -e '-- --' "${__tmp}/tmp1" > "${__fileOut}"

## Final cleanup
rm -rf "${__tmp}"

echo "Converted file located at ${__fileOut}"
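
The same steps could also be chained as a single filter, with no temporary files. This is only a sketch under a couple of assumptions: the function name fileout2txt is made up here, and tr '\r' '\n' is swapped in for mac2unix (both turn CR line endings into LF). The sample input is invented and uses CR line endings like the real fileout.

```shell
# fileout2txt: read a Pharo fileout on stdin, write plain text on stdout.
# Assumption: tr '\r' '\n' stands in for mac2unix (CR line endings to LF).
fileout2txt() {
    tr '\r' '\n' |
        sed 's/!//g' |
        tr -d '\014' |
        grep -v 'commentStamp:' |
        sed 's/stamp: .*//' |
        grep -v -e '-- --'
}

# Made-up sample: comment stamp, form feed, separator, method header, body.
sample_out=$(printf '!Foo commentStamp: prior: 0!\r\014\r"-- -- -- -- "!\r!Foo methodsFor: accessing stamp: vr 1/1/2011!\rname\r' | fileout2txt)
printf '%s\n' "$sample_out"
```

With the function defined, a real conversion would look like fileout2txt < Fileout.st > Fileout.st.txt.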

Enjoy!