Sidecar support #62

bpatrik · 2021-05-11T10:56:49Z

Is there any plan to officially supporting sidecar files?

There are multiple type of sidecar files: XMP, exif, etc. These files live next to the original media file and should have the same format as the data in a jpg itself. It is a good way to attach metadata to e.g: videos.

Here is how exiftool deals with it: https://exiftool.org/metafiles.html
Here is an other lib for xmp sidecar reading that has some test cases: https://github.com/PhilipF5/xmp-sidecar/tree/master/test

I also generated some:
I used exiftool with:
exiftool.exe -xmp -b IMG_5910.jpg > IMG_5910.xmp
exiftool.exe -exif-b IMG_5910.jpg > IMG_5910.exif
The output:
IMG_5910.jpg

IMG_5910.exif: IMG_5910.exif.txt
IMG_5910.xmp: IMG_5910.xmp.txt

I dropped the .exif to https://mutiny.cz/exifr/ and seems to work without an issue. .xmp unfortunately not.

Context of the feature request:
We are currently replacing exif processing (bpatrik/pigallery2#277) for pigallery2. Your nice project looks really promising: bpatrik/pigallery2#277 (comment). We would like a lib that can support sidecars too bpatrik/pigallery2#156. Unfortunately, our only option is exiftool currently: bpatrik/pigallery2#294 (comment)

The text was updated successfully, but these errors were encountered:

MikeKovarik · 2021-05-13T21:04:36Z

Hello. I coincidentally wanted to implement this for a long time. It is both simple and difficult :D
Let me elaborate.

The parsers for XMP, ICC, etc... are already implemented and could be easily applied to any sidecar file. But...

Reading image files is easy because the format is deterministic. Each format starts with a signature (magic bytes). Then there's a structure of segments, each beginning with another signature - denoting what's wrapped inside that segment.

With sidecars, there's no such metadata I could use for determining the format and thus picking the correct parser. Sidecar just starts with the raw data right away. I can only try each parser, wrap it in a try/catch and wait until it fails to try another.
This is absolutely valid solution, but since exifr prides itself with performance, I wouldn't want to add this (in my eyes) inefficient code to parse().

I've been thinking and prototyping for the past two days and came up with function parseSidecar(file[, options][, formatName]). Or something similar. What do you think? Would this work for you?
It'd be just for now, until some better solution materializes. But let's be honest, there's nothing more permanent than a temporary solution :D

kagahd · 2021-05-13T22:47:30Z

@MikeKovarik, why not just rely on the file name extension, e.g. .xmp or .exif ?
I mean, if somebody creates a sidecar file with a wrong file extension, it's not your fault 😏

bpatrik · 2021-05-14T08:15:07Z

@MikeKovarik thank for the quick response!

From my perspective I would also prefer the best possible performance. (We put lately quite som effort to compare multiple exif readers: bpatrik/pigallery2#277 (comment) ).

Options I see here is:

Try/catch everything (not the fastest)
Can read the first few bites a guess the best reader (eg.: if it starts with < then its probably an xmp). There is a chance that you make a bad assumption as this approach would not be a standard way of inferring the data structure. -> this can cause confusion -> lot of github tickets :)
I also think, using extension is a good option. If the data is not parseable based on the extension it should fail. I would keep the format an input option, so users (developers) can force a parser. It would be fast and no magic involved.

You can also take a look at https://exiftool.org/metafiles.html and try to understand how that works on different extensions. The only thing is that exiftool is really slow. During that time it can try all possible parsers twice.

Update: please not that my uploaded files end in .txt only because github does not support xmp and exif extensions.

MikeKovarik · 2021-05-14T15:04:39Z

Hello. I've been writing up proper response that I'll post later. In meantime. I gave shot to the "try infer from filename if type isn't explicitly defined" way with try/catch fallback. But since I'm on a train and about to leave, there's no time for chit chat. I wanted to at least give you the work in progress code if you'd like to take a look at it before I speak my peace :D. So far my weekend seems busy.

For now, you can take a look at
https://github.com/MikeKovarik/exifr/blob/master/debug/sidecar.mjs
and
20f36ae

bpatrik · 2021-05-15T08:50:41Z

I had a look.

I think (without testing) the majority of the time is spent on reading the file and not parsing the result.
So doing some try-catches should not have too high cost. I would do some quick benchmarks on some general photos. (generate all possible type of sidecars for like 100 photos with exiftool and parse them with and without the try-catch trick)

I would still call out (or just simple not support) the try-catch trick. Inferring the datatype has extra cost. I think user usually knows the data they want to feed your lib, so providing the type should be free for them.

Unrelated: I can recommend using .editorconfig and lint (my tslint) in your project. It made my code way more coherent and caught some bad practices, anti-patterns. (like use const if possible, etc.). It is easy to set up and modern IDE-s can auto format the code to match the linting expectations. (Please take it with a grain of salt, I'm not a web developer in my dayjob)

bpatrik · 2021-05-27T21:28:02Z

Is the xmp parser function public, so that could be used in the meantime for parsing xmp sidecars, while the exifr.sidecar gets to a release?

MikeKovarik · 2021-06-01T07:01:32Z

Hello again after a long time. I'm sorry for the delayed response. I'm was looking for a new job and the multi-step hiring process with coding homeworks was putting strain on me.

I'm not so fond of solely relying on file the extension because it isn't bulletproof.
Even though the .txt at the end of IMG_5910.xmp.txt was added github. Someone else can not knowingly download the file and run with it. Or there may not be any file extension at all. I know these are edge cases but ideally I'd like there to be as little friction as possible. And like you mentioned, issues can arise from edge cases and missing out overly complex docs.

That's why there's something i liked about try/catching. Though absolutely not ideal. Performance: it's not so bad anymore. Thanks to async/await syntax it has been reworked in V8 some two years ago. It now runs at the same speed as .then/.catch. But of course, there's always some time spent feeding the data into parser while waiting for it to crash.

But I'd like to keep it as a last resort. As is in the last commit i presented. You can specify the type explicitly, or it will be infered from the filename extension (though the condition on line 13 should also contain && segType === undefined). Then I'll be adding some rudimentary XMP check at line 22 - "does the first few bytes include <?x" or something like that. And if all else fails, then there will be the try/catch fallback.
TLDR: it can be very efficient and you can avoid guessing.

Regarding the last comment: it's in master but not yet published. I thought you'd want to clone master and play with it, but I didn't release it as it wasn't the final implementation.
If the proposal is fine by you then there's not much more work to be done and I could release it soon.

Regarding linting: thanks for the note. I'm thinking about it from time to time. Bit i have a mixed feelings about it from other projects. It can be pain to setup the way I'd like it to work :D perhaps some day

MikeKovarik · 2021-06-28T10:28:15Z

Hello @bpatrik . I forgot to mention that 7.1.0 with sidecar support was released last week.

bpatrik · 2021-07-07T20:07:01Z

That is excellent news! Thank you! (sorry I also got carried away a bit lately, I hope you get the position, that you were wishing for :) )

bpatrik mentioned this issue May 11, 2021

Replace metadata libs with exiftool bpatrik/pigallery2#294

Closed

bpatrik mentioned this issue May 14, 2021

Support XMP sidecar files bpatrik/pigallery2#156

Open

MikeKovarik mentioned this issue Jun 11, 2021

Urls with query string are not working correctly. This fixes it. #65

Merged

MikeKovarik closed this as completed Jun 28, 2021

bpatrik mentioned this issue Jul 7, 2021

Clean up MetadataLoader.ts bpatrik/pigallery2#277

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sidecar support #62

Sidecar support #62

bpatrik commented May 11, 2021

MikeKovarik commented May 13, 2021

kagahd commented May 13, 2021 •

edited

Loading

bpatrik commented May 14, 2021 •

edited

Loading

MikeKovarik commented May 14, 2021 •

edited

Loading

bpatrik commented May 15, 2021

bpatrik commented May 27, 2021

MikeKovarik commented Jun 1, 2021

MikeKovarik commented Jun 28, 2021

bpatrik commented Jul 7, 2021

Sidecar support #62

Sidecar support #62

Comments

bpatrik commented May 11, 2021

MikeKovarik commented May 13, 2021

kagahd commented May 13, 2021 • edited Loading

bpatrik commented May 14, 2021 • edited Loading

MikeKovarik commented May 14, 2021 • edited Loading

bpatrik commented May 15, 2021

bpatrik commented May 27, 2021

MikeKovarik commented Jun 1, 2021

MikeKovarik commented Jun 28, 2021

bpatrik commented Jul 7, 2021

kagahd commented May 13, 2021 •

edited

Loading

bpatrik commented May 14, 2021 •

edited

Loading

MikeKovarik commented May 14, 2021 •

edited

Loading