I reuse the image of the witch and send Veo this prompt:
Hand-held camera. The woman says threateningly "She will sleep forever or until you solve this puzzle". She holds out a folded piece of parchment and the camera zooms towards that. Hyper-realistic.
Brilliant! Just what I wanted, and it took me less than 10 minutes to make this clip.
The parchment contains a cryptogram. I'll use this image of a cryptogram that was purportedly thrown by the famous French pirate 'La Buse' into the crowd that had gathered to watch his execution in 1730. The cryptogram may contain instructions for finding his treasure, which may be worth as much as 5 billion dollars in today's money. It's written in 'pigpen' symbols, a cipher that was apparently once popular with Freemasons. For centuries no one had a good interpretation of the cryptogram until... I solved it! Read about my discovery HERE.
As far as my Quest story goes, all that matters is that it is a cryptogram. I sent ChatGPT the image with this prompt and it did a good job.
Can you make an image where an old man's hands are holding this piece of paper. We see the paper from the perspective of the man as he looks down at it. The arms of the old man are in dark green sleeves. In the background there is out of focus grass. 3:2 format, Hyper-realistic.
I send that to Veo with this prompt:
Hand-held camera. The man runs the forefinger of his right hand over the characters while the camera zooms in. He says with a noble British accent "There's only one man smart enough to solve a cryptogram like this." Then he shouts "Call for Sir Nigel!" Hyper-realistic.
Since Sir Nigel is going to have the leading role in my video, I want a reference image that I can use again and again as a base for making images in ChatGPT. So I send ChatGPT this not-very-recent image of myself and this prompt:
Can you make an image of this person standing wearing elegant medieval clothes. He is tall, about 40 years old. Plain white background. Hyper-realistic. 2:3 format.
That gives this result, which is okay, I suppose, though I could conceivably replace the face in this image with my real face from the photo with the blue background. I then give this prompt:
Now please make an image of that person wearing those clothes riding a magnificent horse that is crossing the jousting arena still with the same out of focus crowd in the background. Cloudy sky. Hyper realistic. The image should not be too dark and should have neutral white balance. 3:2 format.
Here is the result, which is okay but less of a likeness.
I would really like to be able to use a starting image and a final image for this clip. That failed before, but maybe Veo 3 has progressed in the last week? I send ChatGPT this prompt to get the final image:
Can you make a similar image in 3:2 format with the horseman at the right of the image who is approaching the king who is at the left of the image, facing the horseman? They are about 3 meters apart. Hyper - realistic, as always.
The result that ChatGPT gives me is murky again, with warm tones. I fix that to get the second image.
But I'm tired of telling ChatGPT that I want normal brightness and neutral colour balance, so we have this discussion:
I said:
That image is very dark and murky. Is it possible to give you quantified instructions regarding brightness and white balance?
ChatGPT said:
Great question — here's how to address the image quality instructions:
✅ How to give quantified instructions for brightness & white balance
While I can't accept actual numerical values (like “brightness = 0.8”), you can guide the results more precisely using descriptive language like this:
Brightness
"Brighter overall lighting, like a photo taken under soft daylight."
"Avoid dark shadows or murky tones — midtones and highlights should be clearly visible."
"Expose the image as if it's a well-lit overcast afternoon."
White Balance
"Use a neutral white balance — no warm (yellow/orange) or cool (blue) color cast."
"Avoid any golden-hour or sepia tones. Keep skin tones and armor tones natural and balanced."
You can also say:
“Adjust brightness and color balance to mimic a professionally lit studio photo, but in an outdoor medieval setting.”
I will try including the last line from now on, though I don't know why the word 'medieval' needs to be included.
Before I upload the two Sir Nigel images to Veo, I need to uncrop (widen) them. But... my Pollo credits have run out until they are renewed tomorrow. I try uncrop.org, which works, but free uncropping comes with a watermark. I try Runway, where I have an account, but it drastically changes the image of Sir Nigel and the king into this, a nice image which is unusable since we are no longer in the jousting arena.
I tell Runway:
Can you uncrop the image just by adding to the sides rather than making big changes to the original image?
Again it makes big changes to the original part of the image, but perhaps it is usable? Let's try...
I try to send the two widened images to Veo but I hit a new problem! Veo refuses to upload the first image of me, with this message: 'Our policies prohibit uploading of prominent people at this time.' Yippee! I'm a prominent person at last! Or at least I look like one. But which prominent person would that be? I send the image to starbyface.com and it suggests that I look most like these four celebrities:
So if you want to call me Simon Cowell from now on, I will understand.
I ask ChatGPT to regenerate the image and get this. That's quite a crowd! I wonder what it would have cost to assemble such a crowd of extras in real life...
I upload that image successfully to Veo and try with this prompt:
The scene is set in medieval times. The camera tracks the horseman as he rides across the field. He stops three meters away from a man dressed as a king. As he rides at a trotting pace across the field a loud fanfare can be heard. Once the horsemen has stopped near the king he shouts with a noble British accent "Your majesty, I am at your service". Hyper-realistic. Cloudy skies.
Veo 3 still refuses to allow the use of a final image, so I have to switch to Veo 2. Veo 2 cannot do sound, so I remove the sound references from the prompt and run it...
Veo 2 is unable to move between the starting image and the final image in a single movement, so it splits the video into three parts. I think I can use the middle part, though I will have to add sound (with a fanfare, naturally). It's a pity there aren't more shadows near the horse's hooves.
ElevenLabs can generate sound effects, so I give it the prompts 'fanfare' and 'the cheering of a crowd of hundreds of people', and in each case one of the four sound effects that it proposes is usable. So, in CapCut, I add them to the video of Sir Nigel arriving and trim the unwanted beginning and end to get this short clip:
Rather than using the image I made earlier of the king with Sir Nigel, I download a frame from that video to use as the first frame of the next clip (is it sharp enough for that use?). I use that image and this prompt:
The scene is set in medieval times. The hand-held camera orbits to the right. A loud fanfare can be heard and the two men bow to one another. Then the horseman dismounts from the horse and says with a noble British accent "Your majesty, I am at your service". Hyper-realistic. Cloudy skies.
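A side note on the frame grab mentioned above: a still like that could also be pulled out of the clip with ffmpeg. This is only a sketch of one possible approach. It assumes ffmpeg is installed, and the file names and the half-second offset from the end are made up for illustration. Saving as PNG avoids adding another round of JPEG compression, which may help with the sharpness worry.

```python
import shlex

def frame_grab_command(video_path, image_path, seconds_from_end=0.5):
    """Build an ffmpeg command that saves one frame near the end of a clip as a still image."""
    return [
        "ffmpeg",
        "-sseof", f"-{seconds_from_end}",  # seek relative to the end of the input
        "-i", video_path,
        "-frames:v", "1",                  # write a single video frame
        image_path,                        # .png keeps the still lossless
    ]

# Hypothetical file names, for illustration only
cmd = frame_grab_command("sir_nigel_arrives.mp4", "first_frame.png")
print(shlex.join(cmd))
```

Running the printed command (or passing the list to `subprocess.run`) should write the still. Whether that frame is sharp enough is still something to judge by eye.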
Unfortunately Veo 3 isn't capable of rendering Sir Nigel dismounting from the horse, so we just get an awkward twitch instead. Are his hemorrhoids bothering him? I could redo the clip, but I decide to keep it - I'll edit out the twitch.
But wait! Why am I using a Veo-generated voice when I can replace it with the clone of my own voice that I made at elevenlabs.io? Before I can change the voice in that clip I have to isolate it, because otherwise the background sounds interfere with the voice changing. Once I have isolated the voice and changed it into my own (and applied a 'no bass' filter to boost the treble), I export the result to CapCut, where I combine the new voice with the original video, muting the original audio, to get this (later I'll have to try to put some crowd noise back in; I wish I had asked for no crowd noise in any clip, since it's easy to add later).
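For anyone who prefers the command line to CapCut, the last step (keep the picture, drop the original sound, lay the new voice underneath) can be sketched with ffmpeg. This is an alternative under stated assumptions, not the CapCut workflow described above: it assumes ffmpeg is installed, the file names are hypothetical, and the optional crowd track is mixed in with ffmpeg's `amix` filter.

```python
def swap_audio_command(video_path, voice_path, output_path, crowd_path=None):
    """Build an ffmpeg command that keeps the video stream and replaces its audio.

    If crowd_path is given, the voice and the crowd noise are mixed into one track.
    """
    cmd = ["ffmpeg", "-i", video_path, "-i", voice_path]  # inputs 0 and 1
    if crowd_path is None:
        audio_map = ["-map", "1:a:0"]                     # audio from the voice file only
    else:
        cmd += ["-i", crowd_path]                         # input 2: crowd noise
        cmd += ["-filter_complex",
                "[1:a][2:a]amix=inputs=2:duration=first[mix]"]
        audio_map = ["-map", "[mix]"]
    cmd += ["-map", "0:v:0", *audio_map,                  # picture from the original clip
            "-c:v", "copy",                               # do not re-encode the picture
            "-shortest",                                  # stop at the shorter stream
            output_path]
    return cmd

# Hypothetical file names, for illustration only
print(" ".join(swap_audio_command("clip.mp4", "my_voice.mp3", "clip_revoiced.mp4")))
```

Because the original audio stream is never mapped, it is simply left out of the output, which is the command-line equivalent of muting it.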
I send this to ChatGPT:
Can you make an image in which the king and the horseman are standing side by side and we are looking down over the shoulder of the horseman at the same parchment that we saw earlier. The horseman is holding the parchment in his left hand and pointing at the symbols with this right hand. There is faded green grass in the background.
As usual, the image was too murky and warm-toned so I had to fix that.
I ask Runway to uncrop the image to 16:9 but again it makes major changes, adding flowers and bushes, so the image is not usable.
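As an aside, the arithmetic behind widening a 3:2 image to 16:9 is simple: keep the height and add equal strips on the left and right. Here is a minimal Python sketch of that calculation. The function name and the 1536 x 1024 example size are assumptions for illustration, not anything Pollo or Runway exposes.

```python
from fractions import Fraction

def side_padding(width, height, target=Fraction(16, 9)):
    """Return (left, right) pixels to add so the image reaches the target aspect ratio at the same height."""
    target_width = round(height * target)  # exact rational arithmetic, rounded to whole pixels
    extra = target_width - width
    if extra < 0:
        raise ValueError("image is already wider than the target ratio")
    left = extra // 2
    return left, extra - left              # any odd pixel goes to the right-hand strip

# Hypothetical 3:2 image of 1536 x 1024 pixels
print(side_padding(1536, 1024))  # → (142, 142)
```

So roughly 9% of the final 16:9 frame has to be invented on each side, which gives the uncropping tool room to add things like flowers and bushes there, although it does not explain changes to the original part of the image.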
But there is good news - I have received my monthly Pollo credit renewal so I can start using that again for uncropping - it does a much better job than Runway. Also, I can now use Midjourney from within Pollo so I won't have to renew my Midjourney subscription.
I widen the image in Pollo and send it to Veo with this prompt:
The scene is set in medieval times. The camera is hand held and slowly zooms in on the paper that the man is holding. The man on the right holding the paper moves his pointing finger over the paper and says with a noble British accent "It says the only cure for the princess is held by the Holy Snail who lives beyond the Great Swamp." The men look into one another's eyes. Hyper-realistic.
I replace the original voice with my voice clone and get...
That worked well! But, as so often, the camera did not zoom as instructed.
I will reuse the starting image of mounted Sir Nigel facing the king and this is the new prompt:
Hand-held camera. The man at the right says with a noble British accent in an excited and determined tone "My quest shall be to find the Holy Snail. The road is long but I will succeed!" Then the horse turns and he rides away into the distance. Cloudy weather. Hyper-realistic.
That came out okay, but it would have been more interesting with some camera movement, and there is no crowd noise - maybe I'll add some later. Also, since I've decided to replace Sir Nigel's original voice with my cloned voice (possibly a mistake, given the work involved), I had to replace the entire audio track with my voice, which in this case meant losing the sound of the horse neighing and its steps on the ground. It would be good if ElevenLabs had an option to remove a voice in addition to isolating one, but I don't think it does. It occurs to me that CapCut might have that option, and it does, for those who have paid for the Pro features. But I've learnt how to legally use the Pro features without paying...
Click HERE for the next clips.