TikaFix ---Contribution by jerni-zu393#185

Open
jerni-zu393 wants to merge 2 commits into apache:master from jerni-zu393:master
@jerni-zu393

I have added magic bytes for three file types (*.keystore/*.jks, *.apk, *.aac). These file types can now be detected even when the files have no "." extension.

I have attached sample files below for testing purposes.

keystore.tar.gz

apktest.zip
aactest.zip

jerni-zu393 added 2 commits June 15, 2017 12:13
…, *.aac). It can be detect the file types even the files should not have "."extensions .
…, *.aac). It can be detect the file types even the files should not have "."extensions .
@jerni-zu393 jerni-zu393 changed the title TikaFix ---Contribution ny jerni-zu393 TikaFix ---Contribution by jerni-zu393 Jun 15, 2017
@jerni-zu393
Author

Any updates?

@tballison
Contributor

@Gagravarr any objections? It would be useful to open a ticket on our JIRA to track this. It would also be helpful to add unit tests using the files that you've provided. Thank you!

@Gagravarr
Contributor

The keystore one should probably go further down the file, so that it's in alphabetical order like the others.

For APK files, do we know if they always store the entries in that specific order? Or might it change? Currently ZipContainerDetector only requires AndroidManifest.xml to be present. Do we want to mirror that, or keep your wider list of required files?
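
To illustrate the order-independent option, here is a minimal Python sketch. It is not Tika's ZipContainerDetector; `looks_like_apk` and `make_zip` are hypothetical helper names, and the only assumption taken from the discussion is that the presence of AndroidManifest.xml is the deciding entry.

```python
import io
import zipfile


def looks_like_apk(data: bytes) -> bool:
    """Return True if data is a ZIP archive containing AndroidManifest.xml.

    The check reads the archive's entry list, so it does not depend on
    the order in which entries were written.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return "AndroidManifest.xml" in archive.namelist()
    except zipfile.BadZipFile:
        # Not a readable ZIP container at all.
        return False


def make_zip(names):
    """Build an in-memory ZIP with empty entries (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"")
    return buffer.getvalue()
```

A fixed-offset magic match, by contrast, only works when the expected entry happens to sit at the expected position in the file.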

The AAC file magic with ID3 at the front doesn't necessarily look right to me. Won't most MP3 files that start with ID3 tags incorrectly match on this too?
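
A small Python sketch of the concern, assuming the standard 10-byte ID3v2 header layout ("ID3", two version bytes, one flags byte, four syncsafe size bytes). The function names are hypothetical and the sample bytes are constructed for illustration, not taken from the attached files.

```python
def has_id3v2_header(data: bytes) -> bool:
    """True if data starts with an ID3v2 tag header."""
    return len(data) >= 10 and data[:3] == b"ID3"


def id3v2_total_length(data: bytes) -> int:
    """Length of the ID3v2 header plus tag body, or 0 if there is no tag.

    The size field is a 28-bit syncsafe integer: four bytes carrying
    7 bits each. An optional ID3v2.4 footer is not accounted for.
    """
    if not has_id3v2_header(data):
        return 0
    size = 0
    for byte in data[6:10]:
        size = (size << 7) | (byte & 0x7F)
    return 10 + size


# The same tag can precede an MP3 frame or an AAC (ADTS) frame,
# so "ID3" alone cannot tell the two formats apart.
tag = b"ID3\x03\x00\x00\x00\x00\x02\x01" + bytes(257)
mp3_like = tag + b"\xff\xfb"
aac_like = tag + b"\xff\xf1"
```

Both `mp3_like` and `aac_like` pass `has_id3v2_header`; only the bytes after the tag differ.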

<match value="P" type="string" offset="2">
<match value="libfaac" type="string" offset="11"/>
</match>
</magic>

Where is this detection coming from? I cannot find the same signature in the `file` utility or anywhere else I looked. My AAC file starts with FFF1, which is roughly what https://www.garykessler.net/library/file_sigs.html lists as an example.
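
For reference, a hedged Python sketch of an ADTS-style check that accepts an FFF1 prefix. It assumes the ADTS header layout of a 12-bit sync word (0xFFF), a 1-bit MPEG version ID, and a 2-bit layer field that is always 0; `looks_like_adts` is a hypothetical name, and this is not the magic proposed in this PR.

```python
def looks_like_adts(data: bytes) -> bool:
    """True if data starts with what looks like an ADTS (AAC) frame header.

    Mask 0xF6 on the second byte keeps the low 4 sync bits and the
    2 layer bits, while ignoring the MPEG version ID bit and the
    protection_absent bit.
    """
    if len(data) < 2:
        return False
    return data[0] == 0xFF and (data[1] & 0xF6) == 0xF0
```

Under this mask, FFF1 and FFF9 both match, while a typical MPEG Layer III frame header such as FFFB does not, because its layer bits are non-zero. A two-byte signature is still weak evidence on its own.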
