Worked on adding specs and coverage to both repos, and implemented Quasar as planned. Added all models, actions, views, services and workers. Had to switch to another service (ipstack) for doing geo-lookup of IP addresses when processing sessions. Now working on getting all specs passing, then adding routes and Swagger API documentation generation before testing the authentication flow. After that I’ll get to work on the template repo to expose all of these features in the UI. Fun!
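The lookup itself is the easy part; something like this sketch (not the actual service code, and the env var name is just a placeholder):

```ruby
require "net/http"
require "json"

# Hit ipstack for an IP and keep just the location fields we care about.
def geo_lookup(ip)
  uri = URI("http://api.ipstack.com/#{ip}?access_key=#{ENV['IPSTACK_ACCESS_KEY']}")
  JSON.parse(Net::HTTP.get(uri))
      .slice("country_name", "city", "latitude", "longitude")
end
```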
Sorted out GitHub Actions to get both the Punk! and Let’s Punk! projects to run tests and to ship to production when a release is created from a tag on GitHub.
Project is now available here: punk.kranzky.com
Next steps are:
- Get specs implemented for both projects
- Get code coverage report for both projects
- Implement Quasar in the template project for better UI
- Implement tenants, users, groups and sessions in the gem project
- Update the template project to showcase these
- Record a screencast of how these things work
- Officially release both repos
Once that is all done, I will use the template repo to reproduce the 2010 POC.
Split the old RailRoad project into a public gem named Punk! and a template repository named Let’s Punk!
Working to get these to a certain point of doneness before creating a new project from the template repository, which I will use to re-create the 2010 prototype of HackTile.
First day of a new way of working.
Spent the weekend rearranging the home office, so I’ll be in a different mental space. It feels good.
Kids are on holidays. Dropped Eliza off at an all-day drama course, then went shopping with Jack (bought some new coffee… important) and finished off with a bike ride.
Finishing the Big Sur upgrade. PostgreSQL had died. Played around with OBS for recording screencasts; it works better than iShowU. Plan to create a HelloWorld app that smashes RailRoad together with PixiJS and Howler. Need to separate RailRoad out into a gem first, so I’m cracking on with that.
Someone logged a MegaHAL bug noting that it doesn’t work with Ruby 3, which was released at XMAS, so I’m fixing that too.
Starting to muck about with text generation in anticipation of NaNoGenMo this year. I trained a second-order Markov model on over 200 million words of data (three thousand or so texts from Project Gutenberg). Written in Ruby, using my native Sooth library, the entire process took 28 hours and resulted in a 674MB model file. Because Sooth uses a 32-bit context, I used a 16-bit dictionary of words, which I generated by stripping punctuation and capitalising words and then selecting the most frequent 64822 words (I wrote a script to count word frequencies and select words that occurred at least n times, such that the result would contain fewer than 65536 words; I think n ended up being 27 or something like that).
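The script boiled down to something like this (a sketch; the corpus path and output file are illustrative, and the special words get their ids reserved separately):

```ruby
# Count upcased, punctuation-stripped word frequencies across the corpus,
# then find the smallest n such that keeping words occurring at least n
# times leaves fewer than 65536 dictionary entries.
counts = Hash.new(0)
Dir.glob("corpus/*.txt") do |path|
  File.foreach(path) do |line|
    line.upcase.scan(/[A-Z]+/) { |word| counts[word] += 1 }
  end
end

n = 1
n += 1 while counts.count { |_, c| c >= n } >= 65_536
dictionary = counts.select { |_, c| c >= n }.keys.sort

puts "n=#{n}, #{dictionary.size} words"
File.write("dictionary.txt", dictionary.join("\n"))
```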
I want to use the Markov model to generate sentences, but at the moment it does a rather poor job. Here are some examples:
<SENTENCE> SOME GENERATION OF ARCHITECT OF GREATNE WOULD COME FOR THE TERM OF ADORATION <SENTENCE>
<SENTENCE> YES SIR HE MADE HER A MAGICIAN HE EXCLAIMED <SENTENCE>
<SENTENCE> AND THIS THOU BUT <BLANK> WAS A PRAYER OVER THAT ALIEN ELEMENT <SENTENCE>
I should also note that <SENTENCE> and <BLANK> are special words, as are <ERROR>, <PARAGRAPH>, <CHAPTER> and <BOOK>. And that I strip S from the ends of words, so that GREATNESS becomes GREATNE, in an effort to reduce the number of unique words (as removing an S will often turn a word from plural to singular).
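In code, the normalisation amounts to something like this (a sketch; note that the trailing-S rule eats the whole run of esses, which is how GREATNESS ends up as GREATNE rather than GREATNES):

```ruby
# Strip punctuation, upcase, and drop any trailing run of S characters,
# collapsing most plurals onto their singular form.
def normalise(token)
  token.upcase.gsub(/[^A-Z]/, "").sub(/S+\z/, "")
end

normalise("greatness")   # => "GREATNE"
normalise("suggestions") # => "SUGGESTION"
```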
As an example, here is the opening chapter of “The Emerald City of Oz” by L. Frank Baum as presented to the inference algorithm, once parsed into the 16-bit dictionary:
<BOOK>
<SENTENCE> PERHAP I SHOULD ADMIT ON THE TITLE PAGE THAT THIS BOOK IS BY L FRANK BAUM AND HIS CORRESPONDENT FOR I HAVE USED MANY SUGGESTION CONVEYED TO ME IN LETTER FROM CHILDREN <SENTENCE>
<SENTENCE> ONCE ON A TIME I REALLY IMAGINED MYSELF AN AUTHOR OF FAIRY TALE BUT NOW I AM MERELY AN EDITOR OR PRIVATE SECRETARY FOR A HOST OF YOUNGSTER WHOSE IDEA I AM <BLANK> TO WEAVE INTO THE THREAD OF MY STORIE <SENTENCE>
<PARAGRAPH>
<SENTENCE> THESE IDEA ARE OFTEN CLEVER <SENTENCE>
<SENTENCE> THEY ARE ALSO LOGICAL AND INTERESTING <SENTENCE>
<SENTENCE> SO I HAVE USED THEM WHENEVER I COULD FIND AN OPPORTUNITY AND IT IS BUT JUST THAT I ACKNOWLEDGE MY INDEBTEDNE TO MY LITTLE FRIEND <SENTENCE>
<PARAGRAPH>
<SENTENCE> MY WHAT IMAGINATION THESE CHILDREN HAVE DEVELOPED <SENTENCE>
<SENTENCE> SOMETIME I AM FAIRLY ASTOUNDED BY THEIR DARING AND GENIU <SENTENCE>
<SENTENCE> THERE WILL BE NO LACK OF FAIRY TALE AUTHOR IN THE FUTURE I AM SURE <SENTENCE>
<SENTENCE> MY READER HAVE TOLD ME WHAT TO DO WITH DOROTHY AND AUNT EM AND UNCLE HENRY AND I HAVE OBEYED THEIR MANDATE <SENTENCE>
<SENTENCE> THEY HAVE ALSO GIVEN ME A VARIETY OF SUBJECT TO WRITE ABOUT IN THE FUTURE ENOUGH IN FACT TO KEEP ME BUSY FOR SOME TIME <SENTENCE>
<SENTENCE> I AM VERY PROUD OF THIS ALLIANCE <SENTENCE>
<SENTENCE> CHILDREN LOVE THESE STORIE BECAUSE CHILDREN HAVE HELPED TO CREATE THEM <SENTENCE>
<SENTENCE> MY READER KNOW WHAT THEY WANT AND REALIZE THAT I TRY TO PLEASE THEM <SENTENCE>
<SENTENCE> THE RESULT IS VERY SATISFACTORY TO THE PUBLISHER TO ME AND I AM QUITE SURE TO THE CHILDREN <SENTENCE>
<PARAGRAPH>
<SENTENCE> I HOPE MY DEAR IT WILL BE A LONG TIME BEFORE WE ARE OBLIGED TO DISSOLVE PARTNERSHIP <SENTENCE>
<CHAPTER>
So, how to generate a novel novel from this mess? Here are my thoughts:
- Generate a prototype sentence, which consists of a certain number of empty slots for words, with the length of the sentence statistically consistent with what has been observed in the past.
- Populate the slots with some candidate keywords that have high mutual information according to the previous two sentences.
- For the remainder of the slots, determine a list of words that could fill those slots, as constrained by the other known words in the prototype sentence.
- Fill the empty slots with the candidate words, preferring to fixate on a relevant keyword, provided the choice is legal according to the Markov model.
- Generate hundreds of candidate sentences, and select the best according to some heuristic (sketched below).
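In Ruby, the shape of it might be something like this (a sketch of the plan rather than working code; legal_words stands in for the constrained lookup, and score is the heuristic discussed next):

```ruby
SENTENCE = 0 # id of the special boundary word

# Steps 1-4: pin a couple of keywords into a prototype sentence, then
# fill the remaining slots with legal continuations under the model.
def generate_sentence(model, keywords, length)
  slots = Array.new(length)
  keywords.sample(2).each { |kw| slots[rand(length)] = kw }
  (0...length).each do |i|
    next if slots[i]
    prev2 = i >= 2 ? slots[i - 2] : SENTENCE
    prev1 = i >= 1 ? slots[i - 1] : SENTENCE
    slots[i] = model.legal_words(prev2, prev1).sample || SENTENCE
  end
  slots
end

# Step 5: generate hundreds of candidates and keep the best one. The
# rand here is a placeholder for sampling observed sentence lengths.
def best_sentence(model, history, keywords, candidates: 500)
  Array.new(candidates) { generate_sentence(model, keywords, rand(5..15)) }
       .max_by { |sentence| score(model, history, sentence) }
end
```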
The heuristic for selecting the best generation will be a function of two factors: the average information of the generated sentence as measured by the Markov model, and the average mutual information of the generated sentence, as measured by a model that takes the previous few sentences into account, and possibly also the fixation words.
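Something like this, say, where surprise is the -log2 frequency that Sooth already provides, mutual_information is a hypothetical cross-sentence model, and the 50/50 weighting is a guess:

```ruby
# Pack two 16-bit word ids into Sooth's 32-bit context.
def context_id(w1, w2)
  (w1 << 16) | w2
end

def mean(values)
  values.empty? ? 0.0 : values.sum.to_f / values.size
end

# Blend the average per-word information under the second-order model
# with the average mutual information against the recent sentences.
def score(model, history, sentence)
  info = sentence.each_cons(3).map { |a, b, c| model.surprise(context_id(a, b), c) }
  mi   = sentence.map { |word| mutual_information(history, word) }
  0.5 * mean(info) + 0.5 * mean(mi)
end
```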
These fixation words should also be determined stochastically from data, by
observing words that tend to occur in clusters. I am trying here to identify
character names, locations, objects and so on that are pertinent to the story.
If the model generates a sentence containing the word SHERLOCK, for instance, then the mere presence of this word in the story should make it much more likely to occur in the future. This is something to be figured out.
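One speculative way to find them, sketched below (again, nothing here is implemented, and the thresholds are plucked from the air): flag words that are frequent overall yet confined to a small fraction of texts.

```ruby
# "Bursty" words: frequent overall, yet appearing in few of the books.
# These tend to be character names, places and objects.
def fixation_candidates(word_counts_per_book, min_count: 50, max_spread: 0.05)
  totals = Hash.new(0)
  books  = Hash.new(0)
  word_counts_per_book.each do |counts|
    counts.each do |word, n|
      totals[word] += n
      books[word]  += 1
    end
  end
  totals.select do |word, n|
    n >= min_count && books[word] <= word_counts_per_book.size * max_spread
  end.keys
end
```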
Sunny today after a couple of weeks of wind and rain and destructive storms. So out for a mid-morning ride. Conditions were good and I zoomed along. Had a fun moment when I, on the bike path, ringing my bell and weaving between elderly pedestrians, overtook a young couple, resplendent in their spandex (or is it lycra?) and pedalling their expensive racing bikes along the road beside me.
They took one look at my K-Mart bike and I heard the boy whisper something to the girl. Soon, on an uphill section, they zoomed past me, just before I veered off into the back streets to take a short-cut through the nature reserve around the river that I usually favour.
I started riding hard, much harder than usual, to beat them to the spot where our routes would converge. And I prevailed, arriving a good half-minute before them (the shortcut saves at least a few hundred metres of distance), where I dismounted to take a swig from my drink bottle and leisurely finger my phone before nodding as they rode past with stunned expressions, which turned to laughter in their wake.
We’ve been in lockdown for over 50 days now, and yesterday was the first time in seven weeks that I’ve really felt bored. I had an overdue task to work on for my day job, which I was finding difficult to get into; it was a gorgeous day but we had nowhere to go and nobody to visit; and all other forms of entertainment seemed trite. I guess I was suffering from ennui.
I feel better now; I’m making great progress on my overdue task, the weather is worsening, and I’m looking forward to playing some games later today :P