
VCHAR.AI

Creating an AI-powered image manipulation web app for effortless use by everyday users.

Vchar.ai, a tech startup, embarked on a mission to create a user-friendly web app for image manipulation with generative AI, allowing even non-designers to perform highly skilled image editing tasks like model swapping or digital tattoo removal. As a UX/UI intern on the project, I collaborated with both front-end and back-end engineers to craft a distinctive UI and visual identity for the company. My role also entailed delving into prompt engineering and engaging with their clientele.

My Role:  End-to-end Product Design, Prompt Engineering, and Visual Identity Design

Timeline:  3 Months | May 1 - Jul 30, 2023

Team:  Me (UX/UI Designer), Subhashish (Front-end), Saurav (Back-end)


Tools:  Automatic1111, Figma, Adobe Illustrator

OPPORTUNITY SPACE

Vchar.ai is a dynamic tech startup with a diverse clientele that ranges from photo studios in need of image editing to retail websites seeking mannequin swaps and background removal - such as Untukit, Lenskart, RealReal, Soona, GreatnessWins, Nashermiles and Zenkaisports. As an intern, my initial focus was to thoroughly grasp client requirements and delve into prompt engineering, gaining insight into user needs for such a tool. I experimented with Automatic1111, a web UI built on Stable Diffusion. However, Vchar.ai's overarching vision extends beyond providing a service; the company aspires to offer a product accessible to a broader audience, beyond trained designers and prompt engineers.

We have an opportunity to broaden our impact by harnessing our expertise in AI-driven image manipulation to develop an inclusive product that empowers a diverse user base, including individuals without specialized design or engineering backgrounds.

STAKEHOLDER NEEDS

Clientele Requirements

After engaging in direct conversations with the company's clients and collaborating with them on multiple projects, I diligently documented their requirements, pain points, and any additional insights. A significant portion of these clients lacked the resources to employ or access AI-integrated editing software. Their challenges can be summarized as follows:


1. Inadequate Existing Solutions:

  • Current editing software options are often unaffordable due to high pricing.

  • Many find the available software difficult to comprehend.

  • The current solutions often offer limited functionality and produce unsatisfactory results.


2. Efficiency and Speed: Editing tasks are time-consuming and typically involve a team of graphic designers, and clients often cannot afford to give that much time to a single project. They require a platform that can quickly generate high-quality editing results.


3. User-Friendly Interface:

  • Clients seek a platform with a minimalistic and user-friendly interface.

  • Ease of use is essential to cater to individuals who may not have a strong background in prompt engineering or AI technology.

Company's Requirements

The co-founders at Vchar.ai expressed a desire to go beyond serving their current client base and had several requirements in mind:


1. Customization: They wanted to provide editing tools that can be easily tailored to meet the specific requirements of each project. Customization is crucial to accommodating the unique needs of diverse clients.


2. Integration Capabilities: The platform should seamlessly integrate into their clients' existing workflows and systems. It should also be a platform where clients can store and manage their assets.


3. Accessibility: An easily accessible and user-friendly platform, even for individuals with limited technical skills. This accessibility ensures a wider user base.

 


EXPLORING PROMPT ENGINEERING

1. AI-generated art

2. Image manipulated to create nail art

3. AI-generated backgrounds and model swap

I created these images using the Automatic1111 Stable Diffusion WebUI. The tool operates on prompts, which guide the creation of output images: positive prompts define what we want in the images, while negative prompts steer away from unwanted elements. Crafting AI art, swapping models, and manipulating images can each be seen as a distinct process, or workflow. However, once I discover an effective prompt for creating one type of image, I can often apply a similar approach to produce other images with only minor adjustments.

 

For instance, once I figured out the prompts that work for generating nails on a hand, I only needed to change the color; the rest of the prompt could remain the same.



Original Image


Positive Prompt: Glossy, realistic nail, Red, reflective, natural

Negative Prompt: (monochrome:1.4), 3d, render, cg, anime, cartoon, illustrated, bad_hands_5


Positive Prompt: Glossy, realistic nail, Blue, reflective, natural

Negative Prompt: (monochrome:1.4), 3d, render, cg, anime, cartoon, illustrated, bad_hands_5

This concept indicates the potential for a system where users articulate their preferences for an image, while the foundational prompt in the back-end largely remains consistent.
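
To make this concrete, here is a minimal sketch of what such a templated prompt could look like on the back end. The prompt text comes from the nail examples above; the function and parameter names are placeholders of my own, not Vchar.ai's actual implementation.

# A reusable prompt template: the base prompt stays fixed and only the
# user-facing choice (here, the nail color) is swapped in.
BASE_POSITIVE = "Glossy, realistic nail, {color}, reflective, natural"
BASE_NEGATIVE = "(monochrome:1.4), 3d, render, cg, anime, cartoon, illustrated, bad_hands_5"

def build_prompts(color: str) -> dict:
    """Fill the user's choice into the otherwise constant prompt pair."""
    return {
        "prompt": BASE_POSITIVE.format(color=color),
        "negative_prompt": BASE_NEGATIVE,
    }

# The same template covers both examples above:
print(build_prompts("Red"))
print(build_prompts("Blue"))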

AUTOMATIC1111

My General Process 

  • Create a mask for the image to be manipulated

  • Upload the image and the mask in the 'img2img' window

  • Enter prompts

  • Adjust width and height

  • Adjust other settings as needed

  • Configure ControlNet settings (if necessary)

  • Generate results
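
These manual steps can also be scripted against Automatic1111's local API (available when the WebUI is launched with the --api flag). The sketch below is illustrative rather than the exact settings I used: the endpoint and payload fields follow the public Automatic1111 API, while the file names, prompts, and values are placeholders.

import base64
import requests

def b64(path: str) -> str:
    """Read an image file and encode it for the JSON payload."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

# Mirrors the steps above: image + mask into img2img, prompts, dimensions,
# a few settings, then generate. File names and values are placeholders.
payload = {
    "init_images": [b64("hand.jpg")],   # image to be manipulated
    "mask": b64("hand_mask.png"),       # white = area to regenerate
    "prompt": "Glossy, realistic nail, Red, reflective, natural",
    "negative_prompt": "(monochrome:1.4), 3d, render, cg, anime, cartoon, illustrated",
    "width": 512,
    "height": 768,
    "denoising_strength": 0.6,          # how strongly to repaint the masked area
    "steps": 30,
}

# The WebUI exposes this endpoint locally when started with --api.
response = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload)
images = response.json()["images"]      # base64-encoded results

with open("result.png", "wb") as f:
    f.write(base64.b64decode(images[0]))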

Challenges Encountered

1. The interface was full of technical jargon that was unfamiliar to me as a beginner - terms like 'txt2img' (which generates images using only prompts), 'img2img' (allowing image manipulation by uploading an image and masking areas for generation), and 'ControlNet' (enabling user control over the basic image structure) often left me confused in the beginning.

 

2. The presence of numerous settings that were not required added to the complexity and confusion. 

 

3. The learning curve was notably steep, making it a challenging endeavor. Understanding how to effectively use the tool and master its capabilities required a substantial amount of time and effort.

COMPARATIVE ANALYSIS

The competitors for this analysis were selected because they are all actively involved in AI image generation and represent some of the most successful tools in today's market. The products compared are Runway ML, DALL-E by OpenAI, Clipdrop by Stability AI, and Playground AI.

BUILDING PERSONAS

Based on my previous research, which involved conversations with stakeholders and a comparative analysis, I have identified three distinct personas that collectively represent our user base. These personas effectively summarize the individuals or groups who are likely to utilize our platform and outline their specific motivations and use cases.

Kyle, 33 y.o.
Manager at a thrift store

Behaviors

  • Deadline-Oriented

  • Collaborative

  • Attention to Detail

Needs

  • Convert mannequin photos into images with real human models.

  • Ability to track, comment, approve, or request revisions for edited images.

  • Prefers a cost-effective solution.

Amanda, 26 y.o.
Designer at a photography studio

Behaviors

  • Creative-thinker

  • Tech-Savvy

  • Communicative

Needs

  • Access to user-friendly design software

  • Learning resources for new tools

  • A collaborative platform for easy sharing, feedback, and design collaboration.

Sameer, 19 y.o.
College student pursuing engineering

Behaviors

  • Curious about latest advancements in AI

  • Creative thinker

  • Open to learning and adapting to new tech

Needs

  • Explore different AI image generation techniques

  • Ability to save and download work created

  • Access to user-friendly tutorials and guides

DESIGNING THE PROTOTYPES

The UI design process started with brainstorming, followed by creating screens. Before the developers began their work, I prepared two versions of the application: one for the initial launch with only the necessary features (Version 0) and another with advanced features for future development (Version 1+).

BRAINSTORMING FOR APP DESIGN

During my whiteboarding sessions, I determined the key components of the product's main screens. It became evident that the product should feature a homepage serving as a hub for various AI-powered image editing workflows. Recognizing that users would need to upload and store their images in the cloud, it also became apparent that an asset management system with space for image storage would be essential. Additionally, the product should include sections for user assistance and learning resources to help users use the tools effectively. Within the time constraints, I designed two distinct workflows, an asset management system, and the homepage, along with the company's visual design.

        Whiteboard brainstorming session for figuring out the web app's screens and functionalities

___

INITIAL DESIGN

The initial design encompassed all the editing steps, including masking, color adjustments, prompting, and image viewing. However, we decided not to proceed with this design because it appeared to be overwhelming, especially for new users. We aimed for a more streamlined and minimalist workflow that preserved all the essential elements and features of the editing process but presented them in a step-by-step and clutter-free manner.

FINAL DESIGN

1

Visual Design: Logo, colors, and consistency throughout the website

During my internship at the company, I created the logo for "vchar," a term rooted in Hindi that represents 'thought.' Furthermore, I developed a brand guide to serve as a reference for the upcoming team members who will carry on with the website's development.

The logo depicts a brain to illustrate ‘thought’, or vchar

        Logo Options

___

2

Login Screens, Home and Workflows

After logging in or signing up, users are presented with a dashboard home page that highlights their ongoing projects, eliminating the need to search for them. The page also includes a "Learn More" section and a "Help Center," along with a comprehensive list of available workflows and tools within the app.

3

Workflow I
All-in-1 Studio

All-in-1 Studio is a workflow that lets users do whatever they want. It's not very restrictive: as long as users upload an image they want to edit or play around with, they can use this workflow.

All-in-1 Studio > Selection

While working on "automatic1111," I identified three key steps for creating content on an AI image generation platform, which formed the basis of the "all-in-1 studio." These steps are referred to as "selection," "generate," and "canvas." 
"Selection" is essentially a simplified term for masking. Masking involves the process of isolating or modifying specific areas of an image. Users can choose to alter the masked area or the unmasked area. 
 
To simplify the masking process, three primary masking methods are provided, namely "unit selection," "stroke selection," and "brush selection." These methods are designed to empower users to mask objects, whether they are large or small, quickly or with intricate details, to suit their preferences.

Unit selection utilizes AI to automatically detect the surrounding area of your selection.

Brush selection allows you to paint an area and customize the stroke thickness as you desire.

Stroke selection identifies and selects the area around the movement of your mouse cursor.
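
Whichever method the user picks, the selection ultimately has to be rasterized into the same thing: a black-and-white mask in which the white region marks the pixels to regenerate. Below is a minimal sketch of how a brush selection could become such a mask with Pillow; the stroke coordinates and brush width are invented for illustration.

from PIL import Image, ImageDraw

def brush_to_mask(size: tuple[int, int], strokes: list[list[tuple[int, int]]],
                  brush_width: int) -> Image.Image:
    """Rasterize brush strokes into a binary mask: white = area to regenerate."""
    mask = Image.new("L", size, 0)      # start fully black (keep everything)
    draw = ImageDraw.Draw(mask)
    for stroke in strokes:
        draw.line(stroke, fill=255, width=brush_width, joint="curve")
    return mask

# Hypothetical stroke recorded from the user's cursor movement.
mask = brush_to_mask(
    size=(512, 768),
    strokes=[[(120, 300), (180, 320), (240, 310), (300, 330)]],
    brush_width=40,
)
mask.save("hand_mask.png")              # usable as the img2img mask shown earlier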

All-in-1 Studio > Generate

The "generate" section is where you'll find prompts and various settings for AI image generation. Among these settings, the most crucial is the "positive prompt," which is why it's made into a floating button accessible in both the selection and generate spaces. This positive prompt serves as the space for text input, allowing you to describe what you want in an image. 
 

In the sidebar, you can also find additional settings, including an option to exclude elements from the image (negative prompt), adjust the generation strength, and control the image structure through an inspiration image (ControlNet). You can also keep a fixed seed, which ensures consistent results each time you generate. After configuring these settings, clicking "generate" presents the user with four options to choose from, or the ability to regenerate.
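
Conceptually, most of these sidebar controls map onto a small set of Stable Diffusion parameters. The sketch below shows my interpretation of that mapping using Automatic1111-style field names; the values are illustrative, ControlNet would need its own extension payload (only hinted at here), and these overrides would be merged into an img2img request like the one shown earlier.

# Rough mapping from the sidebar controls to Automatic1111-style parameters.
# The dictionary keys are real A1111 API fields; the values are illustrative.
generate_settings = {
    "negative_prompt": "cartoon, illustrated, blurry",  # "exclude elements"
    "denoising_strength": 0.55,                         # "generation strength"
    "seed": 1234567890,                                 # fixed seed -> repeatable results
    "batch_size": 4,                                    # four options to pick from
    # The "inspiration" control would go through the ControlNet extension,
    # which takes its own nested settings (omitted in this sketch).
}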

All-in-1 Studio > Canvas

The "Canvas" is a space where users have control over the frame and dimensions of the image. They can either adjust the canvas by dragging and expanding it or input new dimensions manually using the sidebar. This feature is particularly useful if they want to enlarge the canvas to generate a background. 

4

Workflow II
Model Swap Studio

This is the second workflow I created, designed for model swapping. Users can upload the original picture and make selections for the desired changes, including ethnicity, facial expression, background, and more.

Collapsible sidebar

Save created images to assets or download to desktop


Select an ethnicity filter to choose from existing AI models, and then pick their expression and age.


Choose from pre-made backgrounds or generate one 


Adjust generation strength or make quick changes in the prompt if needed
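
Behind these controls, the user's selections could simply be assembled into a prompt on their behalf, so no one has to write prompt text by hand. The sketch below shows one hypothetical way to do that assembly; the option names and phrasing are illustrative, not the production mapping.

# Hypothetical mapping from Model Swap Studio selections to a prompt string.
def model_swap_prompt(ethnicity: str, expression: str, age: str, background: str) -> str:
    fragments = [
        f"photorealistic {ethnicity} model",
        f"{expression} expression",
        f"around {age} years old",
        f"{background} background",
        "studio lighting, natural skin texture",   # fixed quality terms
    ]
    return ", ".join(fragments)

prompt = model_swap_prompt(
    ethnicity="South Asian",
    expression="smiling",
    age="30",
    background="plain white",
)
print(prompt)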

5

Asset Management to optimize
the storage of images 

The "Assets" section serves as a storage area for all assets, including those uploaded by the user and images generated and saved within the app. The consolidation of these assets in one place led to the need for an effective way to manage all these images.

During discussions with the developers, we identified a challenge: creating folders for organizing assets was a complex endeavor due to constraints in time and resources. To address this issue, I introduced an efficient solution that involved utilizing tags and filters to enhance asset search and categorization.

 

Users now have multiple options for managing their assets. They can initiate a search by typing in the search bar at the top, select specific categories, or sort assets by date or alphabetically. On the right side, of the three tabs, the 'Organize' tab (represented by a tag icon) allows users to select or create a category and add tags to each asset.

 

In addition, users can access detailed information about their assets from the 'Info' tab, including size, creation date, and more. We also recognized that this platform would serve both project managers and designers, and that design often involves multiple iterations. The 'Comments' feature fosters collaborative discussions within a unified space.
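
To illustrate why tags and filters can stand in for folders, here is a minimal sketch of the kind of data model and query this design implies. The class, fields, and sample assets are hypothetical, not the actual back end.

from dataclasses import dataclass, field
from datetime import date

@dataclass
class Asset:
    name: str
    created: date
    category: str
    tags: set[str] = field(default_factory=set)

def filter_assets(assets: list[Asset], category: str | None = None,
                  tag: str | None = None, query: str = "") -> list[Asset]:
    """Search bar, category filter, and tags combined; sorted newest first."""
    results = [
        a for a in assets
        if (category is None or a.category == category)
        and (tag is None or tag in a.tags)
        and query.lower() in a.name.lower()
    ]
    return sorted(results, key=lambda a: a.created, reverse=True)

library = [
    Asset("mannequin_front.png", date(2023, 6, 2), "Uploads", {"mannequin", "front"}),
    Asset("model_swap_v3.png", date(2023, 6, 10), "Generated", {"model swap", "approved"}),
]
print(filter_assets(library, tag="model swap"))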

TAKEAWAYS

Recognizing the significance of acquiring knowledge about a new concept, especially when it's central to a project, has been an eye-opener for me. Prompt engineering for generative AI was entirely unfamiliar territory before, and this journey has taught me that the more I delve into its technical intricacies and become well-versed in the industry-specific language, the better I become at crafting user-friendly platforms.


Working at a startup has been a challenging but incredibly educational experience for me. I've had to learn a lot of new skills, and one of the most important things I've learned is how to balance my design ideas with what the development team can realistically build given the limited time and resources of a startup environment.

My experience at a startup emphasized the vital role of effective communication in a resource-limited setting and underscored the importance of adaptability when facing constant change.
