Uploading an MSG document creates an unnecessary/duplicate version2Store bin file that uses disk space
When uploading an email (MSG file) with an attachment, Alfresco usually creates one bin file per workspace node. These bin files relate to each node, i.e. the message body node, the attachment node and the rendition node.
Usually these nodes are all workspace nodes, but we found that an extra bin file with the same size as the attachment was created in alf_data/contentstore. For example, in the contentstore listing below there are two bin files with the same size of 178176:
rw-r---- 1 alfresco alfresco 327 Jan 11 15:15 66c7f741-a91f-4dfd-9d7d-d72606741837.bin
rw-r---- 1 alfresco alfresco 64211 Jan 11 15:15 993e1952-18f3-477b-baf6-4f8eb914b0c7.bin
rw-r---- 1 alfresco alfresco 178176 Jan 11 15:15 e0743265-aaa8-4a5c-8f7f-43f6a97c7399.bin*
rw-r---- 1 alfresco alfresco 178176 Jan 11 15:15 f17593f1-64e9-460b-8497-5a47ef096988.bin
rw-r---- 1 alfresco alfresco 10089 Jan 11 15:15 f376638b-7c9f-4fc0-b67a-ee1ce2e9318d.bin
rw-r---- 1 alfresco alfresco 169598 Jan 11 15:15 faa28850-6886-46fc-b2e0-d848852e2f3b.bin
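Matching sizes alone do not prove that two bin files hold the same bytes. A minimal sketch of how the comparison could be automated is shown here: it groups the .bin files under a contentstore directory by size and then by SHA-256 hash. The function names and the idea of scanning the whole tree are assumptions made for illustration, not part of Alfresco or of the original investigation.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large bin files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_bins(root: Path) -> List[List[Path]]:
    """Return groups of .bin files under root that have identical content."""
    # First pass: cheap grouping by file size.
    by_size = defaultdict(list)
    for path in root.rglob("*.bin"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[sha256_of(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)


# Hypothetical usage; the path depends on the local installation:
# for group in find_duplicate_bins(Path("alf_data/contentstore")):
#     print([str(p) for p in group])
```

Run against the directory for the upload minute, this would be expected to report the two 178176-byte files as one group if they really are byte-identical copies of the attachment.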
Looking into this, we found that the starred bin file (marked with * in the listing above) relates to a node in the version store.
We used this query to find the node that references the starred bin file:
-- Find the store and node whose content URL points at the given bin file
SELECT s.identifier, n.uuid, cu.content_url, cu.content_size, *
FROM alf_node AS n
JOIN alf_node_properties AS np ON np.node_id = n.id
JOIN alf_content_data AS cd ON cd.id = np.long_value
JOIN alf_content_url AS cu ON cu.id = cd.content_url_id
JOIN alf_store AS s ON s.id = n.store_id
WHERE cu.content_url LIKE '%e0743265-aaa8-4a5c-8f7f-43f6a97c7399.bin%';
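To repeat the lookup for each bin file, the LIKE pattern can be derived from the file name, and the content_url returned by the query can be mapped back to a path on disk. The sketch below assumes the default file content store, where a URL of the form store://year/month/day/hour/minute/name.bin corresponds to the same relative path under the contentstore directory. The helper names, the example root directory and the year in the example URL are hypothetical.

```python
from pathlib import Path, PurePosixPath

# Assumption: default file content store URLs start with this prefix.
STORE_PREFIX = "store://"


def like_pattern_for(bin_path: str) -> str:
    """Build the LIKE pattern used in the query from a bin file path or name."""
    return f"%{PurePosixPath(bin_path).name}%"


def content_url_to_path(content_url: str, contentstore_root: str) -> Path:
    """Map a store:// content URL to the file it names under the contentstore."""
    if not content_url.startswith(STORE_PREFIX):
        raise ValueError(f"unexpected content URL: {content_url!r}")
    relative = content_url[len(STORE_PREFIX):]
    return Path(contentstore_root).joinpath(*relative.split("/"))


# Hypothetical usage (the year in the URL is assumed, not taken from the report):
# like_pattern_for("e0743265-aaa8-4a5c-8f7f-43f6a97c7399.bin")
# content_url_to_path(
#     "store://2021/1/11/15/15/e0743265-aaa8-4a5c-8f7f-43f6a97c7399.bin",
#     "alf_data/contentstore",
# )
```

Passing the pattern as a bound parameter rather than pasting it into the SQL text keeps the query reusable across the bin files being checked.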
The customer has reported this as an issue because it uses disk space unnecessarily.
We believe this is a defect because, when the Outlook Integration (OI) plug-in is used to save the email from Outlook into Alfresco, no such version2Store bin file appears in the content store.
Both the normal upload and the upload via OI use the same transformers, because we have enabled the following option under Share > Outlook Integration settings:
Automatically convert emails (EML, MSG) uploaded using Share, CIFS, WebDAV, FTP, NFS: Enabled
1) Install ACS 6.2.2 with OI 2.7.0.
2) In Share > Admin Tools > Outlook > Integration Settings, enable the above setting so that MSG uploads to Share are automatically converted.
3) Upload the attached MSG file using the normal upload button in Share and open it to view it in Share (see image1).
4) Go to alf_data > contentstore, navigate through the time tree and find all bin files relevant to this upload, as listed in the summary section.
There are two bin files with the same size, using up disk space.
There should not be any duplicate bin files.
Upload via Outlook Integration works as expected: the preview of the uploaded file looks exactly like image1 above.
Running the above DB query for each of the bin files with the same size, we found that a node from version2Store is returned for this bin file: e0743265-aaa8-4a5c-8f7f-43f6a97c7399.bin
We found the node in the Node Browser using its UUID:
Outlook integration 2.7.0
It is odd that a version is being created when no version is listed anywhere in Share. We believe this is being done by the Outlook Integration.