Fourth Floor


AI Art with Jumoke Fernandez


Generative AI refers to algorithms that can be used to create content, including audio, code, images, text, and video. Rather than simply recognising or analysing existing data, these models learn from raw data and generate new outputs when prompted.
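To make that concrete, here is a minimal sketch, using the open-source Hugging Face transformers library, of a generative model producing new text from a prompt; the model name and prompt are illustrative examples only.

```python
# A generative model produces new output from a prompt,
# rather than classifying or analysing existing data.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # example model from the Hugging Face hub
result = generator("A synthographer is an artist who", max_new_tokens=30)
print(result[0]["generated_text"])
```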

This chart in an Nvidia explainer illustrates the applications:  

As shown above, one visual use case of generative AI is image generation, increasingly referred to as synthography, a new genre of image production.

Influences & Work

Jumoke’s work in this field is impressive. We began by speaking about her influences; she points to her parents as having sparked her interest in the intersection of art and technology.

“My father introduced me to coding, and my mother, an art teacher, exposed me to the traditional arts.”

She tells me about her impressive portfolio, which includes work for both Netflix and Meta.

For Netflix, she worked on a mobile game for the cartoon series Miraculous: Tales of Ladybug and Cat Noir.  

She also led the creative direction of a novel ‘Smile to Mint’ NFT for the Black Mirror Experience, inspired by Episode 1 of Series 3, Nosedive. For Meta, she worked with an agency to create generative 3D simulation video systems to design assets at scale. 

Interestingly, she tells me it is her work on the HP Omen project that stands out the most. This project involved integrating Japanese pop aesthetics into manga comics promoting HP products.

“This project was particularly notable for its innovative use of generative AI to blend traditional manga art forms with modern digital techniques, pushing the boundaries of what can be achieved in digital storytelling. It was the first commercial/advertorial comic book using AI, as far as I am aware!”

Generative AI Tooling

As for the tooling behind this work, she mentions that she uses Midjourney, Stable Diffusion, Runway, and Adobe’s Firefly.

Midjourney and Stable Diffusion are currently the most widely used text-to-image models. One comparison finds that Midjourney is best for creating artistic, visually compelling images, whereas Stable Diffusion is best for extensive customisation and technical control over image production.
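To give a sense of the technical control Stable Diffusion offers, here is a minimal sketch using the open-source diffusers library; the checkpoint name, prompt, and parameter values are illustrative assumptions, not recommendations.

```python
# Sketch of the knobs Stable Diffusion exposes (guidance, steps, seed) via diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # example checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a manga-style city street at dusk",
    negative_prompt="blurry, low quality",               # steer away from unwanted traits
    guidance_scale=7.5,                                   # how strongly to follow the prompt
    num_inference_steps=30,                               # denoising steps: quality vs speed
    generator=torch.Generator("cuda").manual_seed(42),    # fixed seed = reproducible output
).images[0]

image.save("street.png")
```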

“I encourage everyone to also explore outside of the go-to tools and workflows, especially if you’re working with these tools within an artistic practice. There are many interesting lesser-known tools and pipelines out there which can be useful to explore and sometimes produce surprising results.”

Outside of text-to-image, I’m interested in what other forms of generative AI she uses to support her wider creative process. She emphasises the utility of Whybot, a complementary text-to-text tool for ChatGPT.

“Whybot is great if you want to create logical systems or mind-map brainstorming based on your input. I would recommend trying it out using a concept for a pitch or project, and seeing how it does with generating some additional food for thought or reasoning behind the ‘why’ for your concept. You could try a prompt like: ‘I’m working on a ___ for a ___, with the theme of ____, create 10 concepts for ____ that appeal to ____.’”
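Whybot itself is a web tool, but the same style of ‘why’-focused prompting can be sketched directly against the ChatGPT API. Below is a minimal sketch using the OpenAI Python client; the filled-in blanks and the model name are hypothetical examples, not Jumoke’s own.

```python
# Hypothetical fill-in of the prompt template above, sent via the OpenAI Python client.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "I'm working on a comic book for a gaming brand, with the theme of "
    "retro-futurism. Create 10 concepts for cover art that appeal to "
    "young digital artists, and explain the 'why' behind each one."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```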

She also recommends Perplexity, which she uses for quick queries. It is an alternative to traditional search engines, delivering concise answers with citations to the underlying sources.

I am keen to speak to her about sound generation as a generative AI application. Specifically, I find it incredible that sound output can be the product not only of text-to-audio models, but of image-to-audio models too. Jumoke tells me that she has used these image-to-audio models herself, both for sound effects in videos and for a web app that lets users take photos and generate songs from them. She recommends this Hugging Face Space for comparing image-to-sound models.

“Ultimately, since all of these inputs and outputs are ‘language’, you can have anything-to-anything.”
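In that spirit, one simple way to approximate an image-to-audio pipeline, loosely along the lines of the web app Jumoke describes, is to chain an image-captioning model with a text-to-audio model. The sketch below uses the Hugging Face transformers library; the model names are illustrative examples, not the ones she used.

```python
# Approximate image-to-audio by chaining image captioning with text-to-audio.
# Model names are illustrative; dedicated image-to-audio models also exist.
import scipy.io.wavfile as wavfile
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
composer = pipeline("text-to-audio", model="facebook/musicgen-small")

caption = captioner("photo.jpg")[0]["generated_text"]       # describe the photo
track = composer(f"a short song inspired by: {caption}")    # turn the description into sound

wavfile.write("song.wav", rate=track["sampling_rate"], data=track["audio"].squeeze())
```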

Ethics & Risks 

Jumoke agrees that it is important to address the pertinent concerns in this emerging field, and she emphasises the importance of inclusivity and ethics in generative AI and synthography. Some of these issues were highlighted in the workshops she co-lectured at the New Centre for Research & Practice.

One area of risk comes from deepfakes: videos, images, or sound recordings in which someone’s face or voice is replaced with someone else’s in a way that appears real. Jumoke expresses both the hope and the necessity that, as AI use grows, institutions will continue to build collective resilience against such attacks. This is also a pertinent policy-making topic. The EU AI Act, passed in March of this year, is the world's first comprehensive AI law. Rather than imposing an outright ban, Article 52(3) of the Act requires transparency from deepfake creators, who must disclose how their work has been artificially altered.

The UK Online Safety Act takes a different approach: sharing ‘intimate’ AI-generated images of someone without their consent, in order to cause distress, is now illegal. Following on from this, the UK Criminal Justice Bill adds that creating or designing deepfakes to cause alarm, distress, or humiliation is also a criminal offence. The US has released a Blueprint for an AI Bill of Rights that acts as guidance, although no federal laws prohibit the creation or sharing of deepfakes; there is growing appetite for regulation, as shown by the January 2024 No AI FRAUD Act proposal.

Debates among policymakers and pressure groups centre on whether the subjective wording of these Acts leaves too much room for prosecutions to be manoeuvred around.

Another concern voiced is that workers in middle- and lower-income countries have been traumatised by moderating content for AI systems. A 2023 Time investigation, followed by a Guardian article, documented the trauma suffered by Kenyan workers. The workers accused Sama, at the time the outsourced content moderator for OpenAI’s ChatGPT, of failing to warn them about the texts and images they would have to review, which included self-harm, murder, child abuse, and bestiality.

Copyright is another major ethical concern. There have been numerous strikes, litigations, and articles accusing tech firms of stealing the work of artists to train their models. In the UK, the Intellectual Property Office, tech firms, and creative organisations have recently failed to reach a consensus on clear guidelines for training AI models on copyrighted material, reinforcing fears that creative work will continue to be copied without permission or remuneration. 

As it stands, Stable Diffusion, Midjourney, DALL·E, and ChatGPT are trained on billions of images and texts scraped from the web, much of it allegedly copyrighted.

Currently, the dispute between Scarlett Johansson and OpenAI is in the headlines. Johansson voiced the AI virtual assistant Samantha in Spike Jonze’s Her. Sam Altman is said to have offered her the chance to provide the voice for the ChatGPT chatbot. She declined, but OpenAI pushed forward with Sky, a voice eerily similar to that of Samantha. Legal notices from her team led to the voice being removed.

Conversely, there is a growing list of companies working with openly licensed images. Adobe’s Firefly is a system trained on images from Creative Commons, Wikimedia, and Flickr Commons, along with 300 million pictures and videos from Adobe Stock and the public domain. Adobe stresses that this work is safe for commercial use and that creators whose work is used will qualify for payments. However, recent reports suggest that its model was also partly trained on images generated by rival Midjourney.

There is also Gen AI by Getty Images, powered by Nvidia and trained on the Getty library. Nvidia’s Picasso is another player, trained on Getty, Shutterstock, and Adobe content, with Nvidia also planning to pay royalties.

These are evolving cases, with Getty Images themselves currently taking legal action against Stability AI, creators of Stable Diffusion, for using millions of its copyrighted images.

It is clear, however, that it would not be just or ethical for creatives to have their work absorbed into training models without their consent. In the meantime, synthographers like Jumoke will have to navigate these uncertainties until transparent and binding laws are implemented.

Forward Trajectory

Notwithstanding these uncertainties, Jumoke’s journey has been incredible, and she has exciting work on the horizon. Her studio, JUMOKE LTD, has just joined the Nvidia Inception program, a free program designed to help startups grow through access to cutting-edge technology and opportunities.

She is also working with a mobile game studio on hybrid workflows that combine AI with traditional 2D software such as Animator and Photoshop. Her work is now also being extended to the beauty and lifestyle industries.

“I’m looking forward to seeing what this year brings to this intersection of AI and Creative Practice which I operate in, excited to make new connections and build meaningful work.”