{"id":19305,"date":"2024-11-12T19:12:33","date_gmt":"2024-11-12T13:42:33","guid":{"rendered":"https:\/\/opstree.com\/blog\/?p=19305"},"modified":"2025-11-21T16:19:14","modified_gmt":"2025-11-21T10:49:14","slug":"understanding-cow-and-mor-in-apache-hudi-choosing-the-right-storage-strategy","status":"publish","type":"post","link":"https:\/\/opstree.com\/blog\/2024\/11\/12\/understanding-cow-and-mor-in-apache-hudi-choosing-the-right-storage-strategy\/","title":{"rendered":"Understanding COW and MOR in Apache Hudi: Choosing the Right Storage Strategy\u00a0"},"content":{"rendered":"<p><span data-contrast=\"auto\">Apache Hudi (Hadoop Upserts Deletes and Incrementals) is a powerful framework designed for managing large datasets on cloud storage systems, enabling efficient data ingestion, storage, and retrieval. One of the key features of Hudi is its support for two distinct storage types: Copy-On-Write (COW) and Merge-On-Read (MOR). Each of these storage strategies has unique characteristics and serves different use cases. 
In this blog, we will explore COW and MOR.<\/span><span data-ccp-props=\"{&quot;335551550&quot;:1,&quot;335551620&quot;:1}\">\u00a0<\/span><!--more--><\/p>\r\n<p aria-level=\"2\"><b><span data-contrast=\"none\">Prerequisites<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:299,&quot;335559739&quot;:299}\">\u00a0<\/span><\/p>\r\n<p><span data-contrast=\"auto\">Before you begin, ensure you have the following installed on your local machine:<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/p>\r\n<ul>\r\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"4\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Docker<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<ul>\r\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"4\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"2\" data-aria-level=\"1\"><b><span 
data-contrast=\"auto\">Docker Compose<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<p aria-level=\"2\"><b><span data-contrast=\"none\">Local Setup<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:299,&quot;335559739&quot;:299}\">\u00a0<\/span><\/p>\r\n<p><span data-contrast=\"auto\">To set up Apache Hudi locally, follow these steps:<\/span><span data-contrast=\"none\">\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/p>\r\n<ol>\r\n<li data-leveltext=\"%1.\" data-font=\"Aptos\" data-listid=\"5\" data-list-defn-props=\"{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\"><span data-contrast=\"auto\">Clone the Repository:<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/li>\r\n<\/ol>\r\n<blockquote><span data-contrast=\"auto\">git clone https:\/\/github.com\/dnisha\/hudi-on-localhost.git<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559731&quot;:720,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span> <span data-contrast=\"auto\">cd 
hudi-on-localhost<\/span><\/blockquote>\r\n<ol>\r\n<li data-leveltext=\"%1.\" data-font=\"\" data-listid=\"5\" data-list-defn-props=\"{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"2\" data-aria-level=\"1\"><span data-contrast=\"auto\">Start Docker Compose:<\/span><\/li>\r\n<\/ol>\r\n<blockquote><span data-contrast=\"auto\">docker-compose up -d<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:1440,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/blockquote>\r\n<ol>\r\n<li data-leveltext=\"%1.\" data-font=\"\" data-listid=\"5\" data-list-defn-props=\"{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"3\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Access the Notebooks:<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/li>\r\n<\/ol>\r\n<ul>\r\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"5\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:1440,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"2\"><span data-contrast=\"auto\">Open your 
browser and navigate to <\/span><span data-contrast=\"none\">http:\/\/localhost:8888<\/span><span data-contrast=\"auto\"> for the Jupyter Notebook.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<ul>\r\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"5\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:1440,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"2\" data-aria-level=\"2\"><span data-contrast=\"auto\">Also, open <\/span><span data-contrast=\"none\">http:\/\/localhost:9001\/login<\/span><span data-contrast=\"auto\"> for MinIO.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<p><b><span data-contrast=\"auto\">Username:<\/span><\/b><span data-contrast=\"auto\"> minioadmin<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559685&quot;:2160,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span> <b><span data-contrast=\"auto\">Password:<\/span><\/b><span data-contrast=\"auto\"> minioadmin<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559685&quot;:2160,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\r\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-19418 size-full\" src=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-18-59-07.png\" alt=\"\" width=\"627\" height=\"325\" 
srcset=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-18-59-07.png 627w, https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-18-59-07-300x156.png 300w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\r\n<h2 aria-level=\"2\"><b><span data-contrast=\"none\">What is Copy-On-Write (COW)?<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:299,&quot;335559739&quot;:299}\">\u00a0<\/span><\/h2>\r\n<p><span data-contrast=\"auto\">Copy-On-Write (COW) is a table type in Apache Hudi that stores data only in columnar base files (Parquet) and applies every change by rewriting files. When data is updated or inserted:<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/p>\r\n<ul>\r\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"9\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\"><span data-contrast=\"auto\">Hudi writes a new version of each affected data file in full.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<ul>\r\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"9\" 
data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"2\" data-aria-level=\"1\"><span data-contrast=\"auto\">The existing data file remains unchanged until the new file is successfully written.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<p><span data-contrast=\"auto\">This ensures that the operation is atomic, meaning it either completely succeeds or fails without partial updates.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/p>\r\n<h2 aria-level=\"2\"><b><span data-contrast=\"none\">Steps to Evaluate COW<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:299,&quot;335559739&quot;:299}\">\u00a0<\/span><\/h2>\r\n<ol>\r\n<li data-leveltext=\"%1.\" data-font=\"Aptos\" data-listid=\"8\" data-list-defn-props=\"{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\">\r\n<h3><b><span data-contrast=\"auto\">Open the Notebook:<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/h3>\r\n<\/li>\r\n<\/ol>\r\n<ul>\r\n<li data-leveltext=\"%2.\" data-font=\"Aptos\" data-listid=\"8\" 
data-list-defn-props=\"{&quot;335552541&quot;:0,&quot;335559685&quot;:1440,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,4],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%2.&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"2\"><span data-contrast=\"auto\">In your browser, navigate to <\/span><i><span data-contrast=\"auto\">hudi_cow_evaluation.ipynb.<\/span><\/i><\/li>\r\n<\/ul>\r\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-19419 size-full\" src=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-00-19.png\" alt=\"\" width=\"624\" height=\"330\" srcset=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-00-19.png 624w, https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-00-19-300x159.png 300w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\r\n<h3><b><span data-contrast=\"auto\">2. 
Run Configuration Code:<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/h3>\r\n<ul>\r\n<li><span data-contrast=\"auto\">Execute all configuration-related code in the notebook.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\r\n<li><span data-contrast=\"auto\">Ensure you specify the COPY_ON_WRITE table type, as shown in the provided image.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-19421 size-full\" src=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-02-25.png\" alt=\"\" width=\"626\" height=\"337\" srcset=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-02-25.png 626w, https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-02-25-300x162.png 300w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\r\n<h3><b><span data-contrast=\"auto\">3. Updating a Record:<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/h3>\r\n<p><span data-contrast=\"auto\">a. 
Update a record in the document=34 partition of the COW table.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span> <span data-ccp-props=\"{&quot;335551550&quot;:1,&quot;335551620&quot;:1}\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-19421 size-full\" src=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-02-25.png\" alt=\"\" width=\"626\" height=\"337\" srcset=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-02-25.png 626w, https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-02-25-300x162.png 300w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/span><\/p>\r\n<p><span data-contrast=\"auto\">b. Because the table uses the COPY_ON_WRITE table type, Hudi writes a new version of the Parquet file for this update. 
You can find this file in the bucket located at <\/span><i><span data-contrast=\"auto\">warehouse\/cow\/transactions\/document=34<\/span><\/i><span data-contrast=\"auto\">.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/p>\r\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-19422 size-full\" src=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-03-35.png\" alt=\"\" width=\"628\" height=\"335\" srcset=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-03-35.png 628w, https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-03-35-300x160.png 300w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\r\n<h2 aria-level=\"2\"><b><span data-contrast=\"none\">What is Merge-On-Read (MOR)?<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:299,&quot;335559739&quot;:299}\">\u00a0<\/span><\/h2>\r\n<p><span data-contrast=\"auto\">Merge-On-Read (MOR) is the other table type in Apache Hudi. Instead of rewriting base files on every update, it defers the merge work to read time or to a later compaction. 
Here\u2019s how it works:<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/p>\r\n<ul>\r\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"13\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">Base Parquet Files and Log Files:<\/span><\/b><span data-contrast=\"auto\"> In MOR, Hudi maintains a combination of base Parquet files alongside log files that capture incremental changes.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<ul>\r\n<li data-leveltext=\"\uf0b7\" data-font=\"Symbol\" data-listid=\"13\" data-list-defn-props=\"{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;\uf0b7&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"2\" data-aria-level=\"1\"><b><span data-contrast=\"auto\">On-the-Fly Merging:<\/span><\/b><span data-contrast=\"auto\"> When a read operation is executed, Hudi merges the base files and log files in real-time, providing the most up-to-date view of the data.<\/span><span 
data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<p><span data-contrast=\"auto\">This approach makes updates and inserts cheaper to write, because Hudi appends changes to log files instead of rewriting entire base files for every change. The trade-off is extra merge work at read time until compaction folds the log files back into the base files.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/p>\r\n<h2 aria-level=\"2\"><b><span data-contrast=\"none\">Steps to Evaluate MOR<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:299,&quot;335559739&quot;:299}\">\u00a0<\/span><\/h2>\r\n<h3><b><span data-contrast=\"auto\">1. Open the Notebook:<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/h3>\r\n<ul>\r\n<li><span data-contrast=\"auto\">In your browser, navigate to <\/span><i><span data-contrast=\"auto\">hudi_mor_evaluation.ipynb.<\/span><\/i><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-19424 size-full\" src=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-05-45.png\" alt=\"\" width=\"627\" height=\"325\" srcset=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-05-45.png 627w, https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-05-45-300x156.png 300w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, 
(max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\r\n<h3><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559685&quot;:1440,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a02. <\/span><b><span data-contrast=\"auto\">Run Configuration Code:<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/h3>\r\n<ul>\r\n<li><span data-contrast=\"auto\">Execute all configuration-related code in the notebook.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}\">\u00a0<\/span><\/li>\r\n<li><span data-contrast=\"auto\">Ensure you specify the MERGE_ON_READ table type, as shown in the provided image.<\/span><\/li>\r\n<\/ul>\r\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-19425 size-full\" src=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-06-35.png\" alt=\"\" width=\"623\" height=\"330\" srcset=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-06-35.png 623w, https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-06-35-300x159.png 300w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\r\n<ol start=\"3\">\r\n<li>\r\n<h3><b><span data-contrast=\"auto\"> Updating a Record:<\/span><\/b><\/h3>\r\n<\/li>\r\n<\/ol>\r\n<ul>\r\n<li data-leveltext=\"%2.\" data-font=\"Aptos\" data-listid=\"11\" data-list-defn-props=\"{&quot;335552541&quot;:0,&quot;335559685&quot;:1440,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,4],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%2.&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" 
data-aria-level=\"2\"><span data-contrast=\"auto\">Update a record in the document=34 partition of the MOR table.<\/span><\/li>\r\n<li data-leveltext=\"%2.\" data-font=\"Aptos\" data-listid=\"11\" data-list-defn-props=\"{&quot;335552541&quot;:0,&quot;335559685&quot;:1440,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,4],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%2.&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}\" aria-setsize=\"-1\" data-aria-posinset=\"1\" data-aria-level=\"2\"><span data-contrast=\"auto\">Because the table uses the MERGE_ON_READ table type, Hudi records this update in a new row-based log file (e.g., Avro) instead of rewriting the Parquet base file. You can find this file in the bucket located at <\/span><i><span data-contrast=\"auto\">warehouse\/mor\/transactions\/document=34<\/span><\/i><span data-contrast=\"auto\">.<\/span>\u00a0<\/li>\r\n<\/ul>\r\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-19426 size-full\" src=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-08-13.png\" alt=\"\" width=\"630\" height=\"329\" srcset=\"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-08-13.png 630w, https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Screenshot-from-2024-11-12-19-08-13-300x157.png 300w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\r\n<h2 aria-level=\"3\"><b><span data-contrast=\"none\">Choosing Between COW and MOR<\/span><\/b><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;134245418&quot;:true,&quot;134245529&quot;:true,&quot;335559738&quot;:281,&quot;335559739&quot;:281}\">\u00a0<\/span><\/h2>\r\n<p><span data-contrast=\"auto\">The choice between COW and MOR in Apache Hudi largely depends on your specific requirements:<\/span><span 
data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/p>\r\n<ul>\r\n<li><b><span data-contrast=\"auto\">Read vs. Write Frequency<\/span><\/b><span data-contrast=\"auto\">: If your workload is read-heavy, COW may be the better choice because queries read Parquet base files directly with no merge step. Conversely, for write-heavy applications where data is ingested frequently, MOR can handle the load more efficiently because it appends changes to log files.<\/span><\/li>\r\n<li><b><span data-contrast=\"auto\">Data Consistency<\/span><\/b><span data-contrast=\"auto\">: Both table types commit writes atomically through the Hudi timeline. On COW, every query reads fully merged base files. On MOR, snapshot queries merge log files at read time to return the latest data, while read-optimized queries read only the base files and can lag behind until compaction runs.<\/span><\/li>\r\n<li><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559685&quot;:720,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><b><span data-contrast=\"auto\">Use Case<\/span><\/b><span data-contrast=\"auto\">: For analytical workloads and batch processing, COW shines. 
MOR is often the way to go for real-time data processing and streaming applications.<\/span><span data-ccp-props=\"{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}\">\u00a0<\/span><\/li>\r\n<\/ul>\r\n<p><strong><a href=\"https:\/\/www.opstree.com\/contact-us?utm_source=Wordpress&amp;utm_medium=Blog&amp;utm_campaign=Deploying_Prometheus_and_Grafana_on_Kubernetes\" target=\"_blank\" rel=\"noreferrer noopener\">OpsTree<\/a><\/strong> is an End-to-End DevOps Solution Provider.<\/p>\r\n<!-- \/wp:paragraph -->\r\n\r\n<!-- wp:buttons -->\r\n<div class=\"wp-block-buttons\"><!-- wp:button -->\r\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/www.opstree.com\/contact-us?utm_source=Wordpress&amp;utm_medium=Blog&amp;utm_campaign=Deploying_Prometheus_and_Grafana_on_Kubernetes\" target=\"_blank\" rel=\"noreferrer noopener\">Contact Us<\/a><\/div>\r\n<!-- \/wp:button --><\/div>\r\n<!-- \/wp:buttons -->\r\n\r\n<!-- wp:paragraph {\"align\":\"center\"} -->\r\n<p class=\"has-text-align-center\"><strong>Connect with Us<\/strong><\/p>","protected":false},"excerpt":{"rendered":"<p>Apache Hudi (Hadoop Upserts Deletes and Incrementals) is a powerful framework designed for managing large datasets on cloud storage systems, enabling efficient data ingestion, storage, and retrieval. One of the key features of Hudi is its support for two distinct storage types: Copy-On-Write (COW) and Merge-On-Read (MOR). 
Each of these storage strategies has unique characteristics &hellip; <a href=\"https:\/\/opstree.com\/blog\/2024\/11\/12\/understanding-cow-and-mor-in-apache-hudi-choosing-the-right-storage-strategy\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Understanding COW and MOR in Apache Hudi: Choosing the Right Storage Strategy\u00a0&#8220;<\/span><\/a><\/p>\n","protected":false},"author":244582680,"featured_media":19428,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_coblocks_attr":"","_coblocks_dimensions":"","_coblocks_responsive_height":"","_coblocks_accordion_ie_support":"","jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false},"version":2}},"categories":[28070474],"tags":[3768,768739390],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/opstree.com\/blog\/wp-content\/uploads\/2024\/11\/Understanding-COW-and-MOR-in-Apache-Hudi.png","jetpack_likes_enabled":true,"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/pfDBOm-51n","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/posts\/19305"}],"collection":[{"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/users\/244582680"}],"replies":[{"embeddable":true,"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/comments?post=19305"}],"version-history":[{"count":8
,"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/posts\/19305\/revisions"}],"predecessor-version":[{"id":30004,"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/posts\/19305\/revisions\/30004"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/media\/19428"}],"wp:attachment":[{"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/media?parent=19305"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/categories?post=19305"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/opstree.com\/blog\/wp-json\/wp\/v2\/tags?post=19305"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}