Published on 31 July 2023.
When working on the video editor and writing about it, I keep talking about clips. But there is nothing called a clip in the source code. Today I will explore the idea of adding that as a concept and see what kind of functionality it will attract.
Here is the cleaned up diff from yesterday’s work on moving back to FFmpeg for proxy generation:
@@ -70,15 +70,16 @@ class FileSource(namedtuple("FileSource", "id,path,number_of_frames_at_project_f
proxy_tmp_path = proxy_spec.get_tmp_path(checksum)
if not os.path.exists(proxy_path):
proxy_spec.ensure_dir()
- p = mlt.Profile()
- p.from_producer(producer)
- proxy_spec.adjust_profile(p)
- producer = mlt.Producer(p, self.path)
- consumer = mlt.Consumer(p, "avformat")
- consumer.set("target", proxy_tmp_path)
- proxy_spec.adjust_consumer(consumer)
- run_consumer(consumer, producer, progress)
- self.create_producer(profile, proxy_tmp_path)
+ subprocess.check_call([
+ "ffmpeg",
+ "-y",
+ "-i", self.path,
+ "-vf", "yadif,scale=960:540",
+ "-q:v", "3",
+ "-vcodec", "mjpeg",
+ "-acodec", "pcm_s16le",
+ proxy_tmp_path
+ ])
os.rename(proxy_tmp_path, proxy_path)
return self.create_producer(profile, proxy_path)
Let’s commit that and see how we can refactor in this area.
$ ./make.py commit -m 'Use FFmpeg for proxy generation.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 2.959s
OK
[main 078c9f2] Use FFmpeg for proxy generation.
1 file changed, 10 insertions(+), 9 deletions(-)
The proxy generation code has this line:
checksum = md5(self.path)
Where md5
is a top level function.
Let’s extract a Clip
class and put the md5
method there. My idea is that a clip represents a file on disk that we can load into a FileSource
and display on our timeline.
Here it is:
class Clip:
def __init__(self, path):
self.path = path
def md5(self):
return subprocess.check_output(["md5sum", self.path])[:32].decode("ascii")
We can see again that we use an external program, md5sum
, to calculate the md5. I think I did that because the Python module for calculating md5 did not have a convenient function for calculating the sum on large files. As I noted previously, I’m fine with this. We should probably add a test though to make sure it works with the external program.
Anyway, the proxy code can now be written like this:
checksum = Clip(self.path).md5()
Perfect! Time to commit:
$ ./make.py commit -m 'Extract Clip and move the md5 function to it.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 2.966s
OK
[main 6488c05] Extract Clip and move the md5 function to it.
3 files changed, 12 insertions(+), 4 deletions(-)
create mode 100644 rlvideolib/domain/clip.py
When working on proxies before, we extracted this module:
rlvideolib.mlthelpers
First of all, its run_consumer
function is no longer used when we generate proxies with FFmpeg. Let’s remove it.
$ ./make.py commit -m 'Remove rlvideolib.mlthelpers.run_consumer since it is no longer used.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 2.974s
OK
[main aff41ef] Remove rlvideolib.mlthelpers.run_consumer since it is no longer used.
2 files changed, 9 deletions(-)
The only thing left now is this class:
class FileInfo:
def __init__(self, path):
self.path = path
def get_number_of_frames(self, profile):
return mlt.Producer(profile, self.path).get_playtime()
This looks a lot like it can be merged into our new Clip
. Let’s see where it is used.
I find an unused import of it and remove it.
$ ./make.py commit -m 'Remove unused import of FileInfo.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 2.967s
OK
[main dda14cf] Remove unused import of FileInfo.
1 file changed, 1 deletion(-)
I wonder if there is a tool that can automatically remove unused imports. Then we can include it in our test runner perhaps? Along with an auto formatter?
Anyway, the only other use is here:
class Transaction:
...
def add_clip(self, path, id=None):
source = FileSource(
id=id,
path=path,
number_of_frames_at_project_fps=FileInfo(
path
).get_number_of_frames(self.project.profile)
)
return self.add_source(source, source.number_of_frames_at_project_fps)
Here we are actually talking about adding a clip. I had forgotten about that. It would make more sense to create a Clip
then instead of a FileInfo
. Let’s add get_number_of_frames
to Clip
and then we can get rid of FileInfo
and rlvideolib.mlthelpers
completely.
$ ./make.py commit -m 'Move get_number_of_frames to Clip.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 2.955s
OK
[main c13f4cb] Move get_number_of_frames to Clip.
4 files changed, 7 insertions(+), 14 deletions(-)
delete mode 100644 rlvideolib/mlthelpers.py
We saw in an earlier devlog that the number of frames that we store is actually the number of frames at the current project FPS. I dig into the MLT source code and see that the out point seems to be calculated based on the FPS:
mlt_position frames = (mlt_position) lrint(format->duration * mlt_profile_fps(profile)
/ AV_TIME_BASE);
if (mlt_properties_get_position(properties, "out") <= 0)
mlt_properties_set_position(properties, "out", frames - 1);
if (mlt_properties_get_position(properties, "length") <= 0)
mlt_properties_set_position(properties, "length", frames);
I wonder if we can do the same calculation so that we don’t have to depend on MLT for get_number_of_frames
?
I’m not ready for that. Let’s rewrite
def get_number_of_frames(self, profile):
return mlt.Producer(profile, self.path).get_playtime()
to this
def calculate_length_at_fps(self, mlt_profile):
return mlt.Producer(mlt_profile, self.path).get_playtime()
to clarify a bit more what it is actually doing.
With this change, I think that the previous rename we did of FileSource.length
to FileSource.number_of_frames_at_project_fps
can be reverted. I think it makes sense to talk about a length in terms of frames. If we change the project FPS, all lengths have to be recalculated, and I think this makes sense. With the new calculate_length_at_fps
it is still clear that this length depends on the FPS I think.
Let’s commit:
$ ./make.py commit -m 'Rename Clip..get_number_of_frames to Clip..calculate_length_at_fps.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 3.457s
OK
[main 94de3c1] Rename Clip..get_number_of_frames to Clip..calculate_length_at_fps.
2 files changed, 3 insertions(+), 3 deletions(-)
$ ./make.py commit -m 'Rename FileSource.number_of_frames_at_project_fps to FileSource.length.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 2.952s
OK
[main 5bca102] Rename FileSource.number_of_frames_at_project_fps to FileSource.length.
2 files changed, 12 insertions(+), 12 deletions(-)
Now when Transaction.add_clip
reads like this
def add_clip(self, path, id=None):
source = FileSource(
id=id,
path=path,
length=Clip(
path
).calculate_length_at_fps(mlt_profile=self.project.profile)
)
return self.add_source(source, source.length)
I don’t think we need this comment anymore:
# NOTE: The length depends on the FPS of the project. Once the first
# FileSource is added to the project, the FPS of the project can not be
# changed.
I think the code above makes it clear enough that the length depends on the FPS of the project.
$ ./make.py commit -m 'Remove note about length/FPS since this is clear enough in the code now.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 2.955s
OK
[main ab10ec9] Remove note about length/FPS since this is clear enough in the code now.
1 file changed, 4 deletions(-)
Sometimes I feel bad for changing the code back and forth. But I don’t think I should feel bad. I think it’s a good thing to always make the code express our intentions as well as possible. If that means that names change back and forth, that’s ok.
What more functionality can the clip class attract? How about proxy generation?
A proxy clip is an alternative version of a clip. An alternative file on disk that represents the same thing but in a format that is easier to work with.
It makes sense.
Let’s extract it:
class Clip:
...
def generate_proxy(self, proxy_spec, progress):
# TODO: call progress
checksum = self.md5()
proxy_path = proxy_spec.get_path(checksum)
proxy_tmp_path = proxy_spec.get_tmp_path(checksum)
if not os.path.exists(proxy_path):
proxy_spec.ensure_dir()
subprocess.check_call([
"ffmpeg",
"-y",
"-i", self.path,
"-vf", "yadif,scale=960:540",
"-q:v", "3",
"-vcodec", "mjpeg",
"-acodec", "pcm_s16le",
proxy_tmp_path
])
os.rename(proxy_tmp_path, proxy_path)
return proxy_path
And FileSource.load_proxy
can be reduced to this:
def load_proxy(self, profile, proxy_spec, progress):
return self.create_producer(
profile,
Clip(self.path).generate_proxy(proxy_spec, progress)
)
This reads quite nicely I think. Let’s commit:
$ ./make.py commit -m 'Extract Clip.generate_proxy.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 2.929s
OK
[main cc85022] Extract Clip.generate_proxy.
2 files changed, 25 insertions(+), 18 deletions(-)
The ProxySpec
class currently lives in rlvideolib.domain.project
. I think a much better place would be rlvideolib.domain.clip
.
Let’s move it over.
$ ./make.py commit -m 'Move ProxySpec to rlvideolib.domain.clip.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 2.903s
OK
[main 36932f0] Move ProxySpec to rlvideolib.domain.clip.
2 files changed, 50 insertions(+), 49 deletions(-)
The proxy spec has a method for setting properties on a consumer:
def adjust_consumer(self, consumer):
consumer.set("vcodec", self.vcodec)
consumer.set("acodec", self.acodec)
consumer.set("qscale", self.qscale)
Now when we use FFmpeg for proxy generation, this is no longer used.
But we did hardcode some values for FFmpeg, let’s change adjust_consumer
to get_ffmpeg_arguments
:
def get_ffmpeg_arguments(self):
return [
"-vf", f"yadif,scale=-1:{self.height}",
"-q:v", self.qscale,
"-vcodec", self.vcodec,
"-acodec", self.acodec,
]
And use it like this:
def generate_proxy(self, proxy_spec, progress):
...
subprocess.check_call(
[
"ffmpeg",
"-y",
"-i", self.path,
]
+
proxy_spec.get_ffmpeg_arguments()
+
[
proxy_tmp_path
]
)
Commit:
$ ./make.py commit -m 'ProxySpec know how to generate FFmpeg arguments for conversion.'
...................................................
----------------------------------------------------------------------
Ran 51 tests in 3.412s
OK
[main 6579178] ProxySpec know how to generate FFmpeg arguments for conversion.
1 file changed, 20 insertions(+), 14 deletions(-)
I am quite happy with the refactoring that we did in this session. We finally have a concept of a clip in the source code and it turned out that it attracted a bit of functionality that previously was spread across the code.
The code base is now a little cleaner.
Why is that important?
Because clear thinking is clear writing and vice versa. It applies to code as well. If we have expressed our thinking clear in code, thinking about it is easier, and so making the next change is easier. And to think clearly about it, we need to write about it, refactor it, until it expresses our ideas clearly. It’s a circle that probably never ends but perhaps slows down as our ideas and the expression of them in code approach each other.
What is Rickard working on and thinking about right now?
Every month I write a newsletter about just that. You will get updates about my current projects and thoughts about programming, and also get a chance to hit reply and interact with me. Subscribe to it below.