Interview: Maui Mauricio on AI-Assisted Art Creation
Maui Mauricio is a creative producer and director from the Philippines. He loves experimenting with new technology, which has led him to dabble in AI content generation.
This week, NiftyZone had the opportunity to speak with Maui about his experience, his artwork, and his thoughts on AI's role in art creation and how the technology can add to his professional work.
Hi Maui, can you let our readers know a little about yourself?
My name is Maui Mauricio, and I'm a professional video editor, producer and occasional director from Manila, Philippines. My passion has always been new technology, partly because my career as a filmmaker began at almost the same time digital filmmaking did. My college classmates and I started out making music videos, a field that valued experimental production and post-production techniques, and through that I discovered my love for unusual methods of shooting footage as well as editing it.
So when and how did you get started in using AI for content creation?
I first got into AI-generated imagery not with the intent of creating art but rather through its use as a tool in my field of work. I became fascinated with the development of AI-enhanced image and video upscaling, as most of our earlier work, shot in the then-standard resolution of 720×480, wasn't ageing very well and looked really terrible when played back on high-resolution screens. For my first experiment, I decided to upscale one of my favorite past projects, a short film directed by one of my close friends back in college for his final thesis. I had to go back to the actual tapes themselves, upscale the recaptured footage, and re-edit the entire film from scratch. I was really happy with the results, but it wasn't until much later that the use of AI in generating images entered my awareness and interest.
I never really went into AI art with the actual intent of creating art – as I said above, for me it was a means to an end. I'm a big Richard Linklater fan, and I wanted to emulate what he did with Waking Life and A Scanner Darkly, and AI provided a potential avenue for that. AI tools were also starting to be introduced in the applications I used daily: Adobe introduced the Roto Brush in After Effects, a subject-separation tool which eventually became my most commonly used tool. I discovered Topaz Gigapixel and Video Enhance early on as well, and slowly my fascination with what AI can do grew.
But it wasn't until the pandemic – that dark era where most of us were stuck at home with nothing to do – that I began to experiment with AI image generation. RunwayML made their text-to-image model available to the public and I jumped on that pretty quickly. The images were only vaguely reminiscent of the prompts I provided, but I definitely saw the potential. I also tried Wombo at the start of the year, but (work does get in the way most of the time, haha) I only really got into it when Midjourney popped onto the scene last July. That's when I started doing lots of experiments, testing out different styles and subjects (just like everyone else, I suppose).
I've since moved on to training my own models and doing animation, the public release of Stable Diffusion 1.4 being a catalyst for so many innovative uses of the fledgling technology. My intent is to keep up with the trends and create newer and better pieces (and even help with development: I'm a patron and sort-of beta tester for a couple of developers' efforts) until AI-generated art becomes a more commonly used asset within the fields I work in.
Training models, in particular, is something I'm very keenly interested in, as I see it as a big step towards the creation of truly novel AI art by way of unique personalities and art styles. I'm currently at the stage of testing Stable Diffusion's inherent racial biases when it comes to skin color and facial features. SD tends to handle white faces very well, as expected, so I'm partial to testing non-Caucasian faces (Southeast Asian ones in particular) to find ways of breaking that bias. It's interesting to note that the model I trained on my face has "broken" somewhat: prompts calling for generic handsome men or beautiful women now generate Asian faces, and they don't necessarily look like me.
Can you share some of your AI-assisted creations?
My preferences change every week as I discover something new with the technology. Here are a few of my current favorites:
The above is a piece generated using the model trained on my face. It's part of a larger collection of works created using Visions of Chaos' random prompt generator, which I modified using AI-produced lists of words to really mix things up. I'm really happy with those faces, as the prompt ("videos made of vet and yeah by Steel and Father high concept abstract, surreal, fashion photography studio, 35mm") has no markers to indicate a specific look.
Another similarly generated series (but with different art direction) is Sui Generis, trained on a friend's face.
The "Lady in Tea Shop" is yet another series created using that workflow. I'm really into invoking ornate, abstract patterns in my prompts, as they create a contrast with the cohesion of the subject that model training seems to enhance.
Training your model helps produce more cohesive imagery! – Maui Mauricio
Next we have a short montage done in Deforum Stable Diffusion using the video input method. This brings me really close to my original goal, which is to apply the technology to the medium of visual narrative.
That doesn't mean I don't use the vanilla SD model as well. Whenever I want to break cohesion and produce abstract, almost hallucinatory compositions, I go default, do a big batch (500-1000 images), and play with my CFG scale.
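That big-batch, CFG-sweeping approach can be sketched in code. Below is a minimal, hypothetical illustration (not Maui's actual script) using Hugging Face's `diffusers` library, where the CFG scale corresponds to the `guidance_scale` parameter; the model ID, prompt, and batch sizes are assumptions for the example.

```python
def cfg_sweep(cfgs, batch_size):
    """Plan a render batch: one job per (cfg, index) pair,
    mirroring the 'big batch, play with CFG' approach."""
    return [(cfg, i) for cfg in cfgs for i in range(batch_size)]


def generate_batch(prompt, cfgs=(4.0, 7.5, 12.0), batch_size=100):
    """Render the planned batch. Requires a CUDA GPU plus the
    `torch` and `diffusers` packages, so imports live here."""
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",  # vanilla SD 1.4, as in the interview
        torch_dtype=torch.float16,
    ).to("cuda")

    for cfg, i in cfg_sweep(cfgs, batch_size):
        # Lower guidance_scale follows the prompt loosely (more
        # hallucinatory); higher values produce more literal images.
        image = pipe(prompt, guidance_scale=cfg).images[0]
        image.save(f"batch_cfg{cfg}_{i:04d}.png")
```

Calling `generate_batch("ornate abstract patterns, surreal fashion photography, 35mm")` on a GPU machine would emit 300 images across three CFG values, which could then be culled down to a handful of keepers.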
Bea Benedicto is an influencer who I’ve had the pleasure of working with in the past and she was kind enough to allow me to use her face to train the model. I’m happy to say that these are near-perfect representations of her, despite the fact that the art direction is very different from her usual grams.
You have found some interesting ways of using these AI tools. Where do you get your inspiration and ideas from?
I'm restless and a tinkerer by nature, so my main motivator is novelty, or the drive to push the tech until it bends or breaks. I canceled my Midjourney subscription after just a couple of months, the reason being that Stable Diffusion was just so much more flexible and can produce video.
Stylistically I’d say I really do adore cohesion, or in other words I find joy in finding order in chaos (which is basically how diffusion models work, if you think about it). So the pieces I tend to like are those that highlight that contrast.
Can you share the setup you are using?
Dreambooth for training
Automatic1111's GUI for images, or Deforum Stable Diffusion for images and animation, both running locally in Visions of Chaos
CodeFormer for facial restoration
Photoshop or Inpaint/Outpaint in Automatic1111’s GUI for fine-tuning
Topaz Gigapixel for upscaling
Lightroom/Luminar AI for final color tweaks and enhancement
My process changes almost every week as new techniques get discovered (it really pays to learn from other enthusiasts), but right now it usually begins with training a new model in the Dreambooth Colab notebook, converting that model to a ckpt file, plugging it into Automatic1111's SD GUI, and mucking around until I get the results I want. The process definitely comes first: most of my time is spent swapping out prompt words and modifying the different settings until I'm happy with the results. As I mentioned, I tend to render in big batches, and out of about 1000 images I ultimately select around 50 that I then clean up: CodeFormer on low for faces, Photoshop and/or inpainting to fix finer details like hands (I'm really more of a photo-manipulation guy than an illustrator, so the latter is a big help), upscaling in Gigapixel, and finally color correction in Luminar AI or Lightroom.
Any guess about the future of the role of AI in the Arts?
AI art is definitely in its infancy, and things are only going to accelerate once people get hold of NVIDIA's newest cards, which will mean bigger, more complicated compositions produced at a much faster rate. I think there will be an eventual fall-off of interest among the general public, i.e. the casual users who currently dominate the conversation (at the same time, however, there will be a huge uptick in the number of professional artists who began with AI). Concurrently, there is a growing interest in the technology within the advertising industry (I'm actually doing something AI-related for the Singapore market currently), and while I can't speak for the art, advertising and film communities, I think that once the tech really matures, AI will be integrated into the workflow of the typical graphic artist/illustrator/animator/content creator just as much as, say, Adobe Photoshop, Blender, or Unreal Engine. It's like the early days of digital filmmaking all over again, just much louder and much, much faster.
What’s next for you (in the context of AI Art)?
While I’m refining my current process (model training, prompt creation), I’m waiting for the next developments that I can pounce on and test out. We’re still in the early days of AI art and animation generation, and as much as I would want to fantasize about what we can do with the technology tomorrow, I find it more exciting to just stay in the moment, have a very open mind, and keep prompting.
Thank you, Maui, for sharing!
Here are some links for readers who are interested in following Maui and seeing more of his work.
IG (AI promptings): @mauitellsthemachines
IG (AI faces): @mauifacesthemachines