Amazing paper, I re-read it in more detail today. It feels very rich, almost like a new field of study - congratulations to the authors.
I'm ninjaing in here to ask a question: in the initial checkerboard discussion you point out that the 5(!) circuit Game of Life implementation shows a bottom-left to top-right bias, which is very intriguing.
However, when you show larger versions of the circuit, and in all later demonstrations, the animations run top-left to bottom-right. Is this because you trained a different circuit that happened to have a different bias, because the figures were rotated inconsistently, or for some other reason? Either way, I'd recommend at least mentioning it in the later sections (or rotating the figures if that aligns with the science), since you rightly called it out in the first instance.
Author here. Thank you! You're seeing that correctly. The directional bias is the result of some initial symmetry breaking and is likely random-seed dependent. The version that constructs the checkerboard from the top right down was trained asynchronously, and the one that builds from the bottom left up was trained synchronously. The resulting circuits are different.
Adversarial images targeting an image classifier have been shown to transfer to separately trained models (i.e., models with different weights or architectures from the model the adversarial image was constructed to target).
I'm curious whether the adversarial CA reprogramming techniques are similarly transferable. That is, do the adversarial CA and/or the adversarial perturbation matrix transfer to separate CAs (trained on the same task) with different weights or architectures from the original targeted CA?
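To make the question concrete, here's a rough sketch (my own, not from the paper) of what such a transfer test might look like, assuming channels-last NCA states, two separately trained update functions nca_a and nca_b (hypothetical names), and a perturbation delta that was optimized against the first one:

    import numpy as np

    def rollout(update_fn, state, steps=200):
        # Iterate an NCA update function for a fixed number of steps.
        for _ in range(steps):
            state = update_fn(state)
        return state

    def transfer_loss(update_fn, seed, delta, target_rgb):
        # Apply a perturbation found against a *different* model, roll out,
        # and measure how close the result is to the adversarial target.
        final = rollout(update_fn, seed + delta)
        rgb = final[..., :3]  # compare visible channels only
        return np.mean((rgb - target_rgb) ** 2)

    # Hypothetical usage: delta was optimized against nca_a; a low loss for
    # nca_b as well would suggest the adversarial perturbation transfers.
    # loss_b = transfer_loss(nca_b, seed, delta, target_rgb)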
Authors here. If you have any questions we'll do our best to answer them! Glad to see people find our work interesting thus far.
We also encourage anyone interested to play with the linked Google Colabs [1][2] and read the other articles in the Distill thread. In the Colabs you'll find a bunch more pre-trained textures as well as a workflow to train on your own images, plus some of the scaffolding to recreate figures.
This is the first I've ever read about neural cellular automata. I think I was able to pick up the broad strokes from context, but is there a good introductory resource for neural cellular automata?
Really impressive work - within seconds of reading I could see so much richness of ideas and potential!
And, as is so often the case, the really interesting work happens at the intersection of two fields - here, neural nets and cellular automata. I've got tons of new reading to do now!
There's some recent work that applies NCAs in a 3D setting by Horibe et al. [1], with an accompanying tweet [2]. Other work by Risi and collaborators is definitely worth checking out as well.
Great post, thanks! I saw that in the Growing Neural Cellular Automata article you describe a strategy to get the model to learn attractor dynamics. It reminded me of Deep Equilibrium Models (https://arxiv.org/abs/1909.01377).
Is there a relationship between these models, and do you think those root-finding and implicit differentiation techniques could be used to train cellular automata too?
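To illustrate what I have in mind, here's a rough PyTorch sketch (my own, not from either paper) of the cheapest variant: iterate a hypothetical NCA update to an approximate fixed point without tracking gradients, then backpropagate through a single step at the equilibrium. A full DEQ-style approach would instead solve the implicit-function-theorem linear system in the backward pass.

    import torch

    def equilibrium_step(update_fn, x0, n_iters=100):
        # Iterate a (hypothetical) differentiable NCA update toward a fixed
        # point x* with update_fn(x*) ~= x*, without building an autograd graph.
        with torch.no_grad():
            x = x0
            for _ in range(n_iters):
                x = update_fn(x)
        # One differentiable step at the equilibrium: gradients w.r.t. the
        # update's parameters flow only through this last application (a crude
        # one-step approximation to true implicit differentiation).
        return update_fn(x)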
The second cell looks like a section title ("Imports and Notebook Utilities"), but it contains the definitions of those functions and the imports. Run this cell and I suspect things will work.
In typical parlance today, "seminal" means "from which a bunch of important things have sprung", but I think there is an older definition that is simply "first".
Apologies, not my intention. I was also under the impression that "seminal" could be used to mean "first" in the succession of our works, and that is what I had intended to communicate.
As a non-native but long-term speaker of English, I understand "seminal" as in "their seminal work" as "groundbreaking" (and better to be avoided when referring to one's own work). But slips of the pen are inevitable, so no harm done.
The article says "our NCA model contains 16 channels. The first three are visible RGB channels and the rest we treat as latent channels which are visible to adjacent cells during update steps, but excluded from loss functions."
Thanks for noticing. This is a typo stemming from early experiments. We started out with 16 channels, but switched to 12 channels of state when this worked just as well. I've submitted a correction.
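For anyone re-implementing this, the channel split is just array slicing. Here's a minimal sketch (illustrative only, using a plain per-pixel L2 loss rather than the article's actual training loss), assuming a channels-last state of shape [H, W, 12]:

    import numpy as np

    N_CHANNELS = 12   # full per-cell state (after the correction above)
    VISIBLE = 3       # RGB channels that enter the loss

    def rgb_loss(state, target_rgb):
        # Only the first three channels are compared against the target; the
        # remaining latent channels are seen by neighbouring cells during
        # updates but carry no loss term of their own.
        rgb = state[..., :VISIBLE]
        return np.mean((rgb - target_rgb) ** 2)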
This is an interesting idea and an innovative approach - exciting stuff, and congratulations on securing the 200k investment.
Some questions regarding the underlying technology:
You mention a "self-healing" grid. If one of these laser lines gets disrupted, say by a bird, or someone knocks over the laser receiver - how quickly does the PoP reroute traffic over a different path in the grid? Does it wait for a timeout, or is there some meta-data from the laser link to know when the interface is down?
One of the reasons you use the laser PoPs is that underground fibre is expensive. However - given that you have to pull overground fibre to every customer within the PoP - the total "length" of fibre in play will be roughly equal to the length of fibre required in a traditional underground installation. What's the advantage then? Why can't you pull all the fibre overground and bypass having laser PoPs completely?
Finally - given the massive investment in 4G and the price wars in India, I would assume you are competing with mobile broadband routers. How is the performance of such solutions in dense cities like Bengaluru? Is it unreliable or congested enough that people will want to pay for a dedicated fibre/laser internet connection?
OP here. Thank you for the questions. Yep, metadata and quick reconnection are the key for birds and such - we are able to reroute almost instantaneously. Regarding 4G: our networks are choked, and as many towers as can be placed already have been. The spectrum is packed and noisy, dense commercial areas get horrible signal and speeds, and the situation in rural areas is worse. To make matters worse, less than 30% of the mobile towers themselves have fiber backhaul - so we hope to make a contribution there too. And finally, the core of the network can't be overhead fiber simply because it's not reliable enough to serve as the backbone when you're stringing it across trees, poles, and over buildings. It's fine for the last mile to the customer's premises, but overhead fiber is not good enough for the core.
I think the idea is very useful from a UX perspective.
Some questions that immediately hit me:
How will you incentivise users to want to jump on a video chat with a random person? I think some people may be frightened by the thought of a stranger opening a video chat with them.
Currently the video opens full screen. Have you considered having a floating window to allow the user to continue interacting with the website?
Have you considered tracking interactions, or even sharing the user's screen, to allow you, as a UX person, to see what the user is seeing while talking with them?
Most of the user researchers & research ops people I've spoken with currently use some form of monetary incentive (credit for their own product or cash) to incentivise users to join their user research sessions. This process would work the same with Ribbon, although outside of the platform. In the next few days I'm adding functionality for the interviewer to be able to customise what message users see in the user recruitment pop-up, so that they can let users know about any compensation they will get for their time. In the mid-term another way to make running user research even easier is to help facilitate those payments directly on the platform.
The floating video is a really interesting idea! I've opted for making the script as light as possible, with all of the video chat functionality being hosted on Ribbon. This is to make sure the script doesn't slow down any website it is added to. Further down the line it will be interesting to explore how to make the user journey even more frictionless whilst still keeping website performance top of mind.
Screen sharing is another functionality that's proving really useful in remote user testing. I'm adding that over the coming month!