Encrypt Student Data MapReduce Assignment

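Move the file from the local file system to HDFS. The /mca54 directory must already exist on HDFS; otherwise the file is copied to a file named /mca54.
```
hdfs dfs -copyFromLocal "C:\\Users\\labuser\\Desktop\\MCA54\\EncrypDataProject\\assets\\symca-students.csv" /mca54
```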

Check whether the file was copied to HDFS
```
hdfs dfs -ls /mca54
```
Open the Hadoop web interface (NameNode status page) and confirm that the file we just copied appears there.
Now create a Java project in VS Code (EncryptDataProject).
Copy the JAR files from the Hadoop installation (hdfs > common and mapreduce directories) into the lib folder of the Java project.

Java file EncryptStudentData.java

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EncryptStudentData {

    // Mapper for a map-only job: emits each input line once, in encrypted form.
    public static class EncryptMapper extends Mapper<Object, Text, Text, NullWritable> {
        // Reused for every record to avoid allocating a new Text per line.
        private final Text encryptedLine = new Text();

        // Number of positions each letter is shifted (Caesar cipher).
        private static final int SHIFT = 3;

        // Simple obfuscation, not cryptographically secure encryption.
        private String encrypt(String input) {
            StringBuilder sb = new StringBuilder(input.length());
            for (char c : input.toCharArray()) {
                if (c >= 'A' && c <= 'Z') {
                    // Shift within A-Z, wrapping around after 'Z'
                    sb.append((char) ((c - 'A' + SHIFT) % 26 + 'A'));
                } else if (c >= 'a' && c <= 'z') {
                    // Shift within a-z, wrapping around after 'z'
                    sb.append((char) ((c - 'a' + SHIFT) % 26 + 'a'));
                } else if (c == ' ') {
                    sb.append('_'); // Obfuscate spaces with underscore
                } else {
                    sb.append(c); // Keep punctuation, digits, and non-ASCII characters as is
                }
            }
            return sb.toString();
        }

        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            // key is the byte offset of the line (unused); value is the line itself.
            encryptedLine.set(encrypt(value.toString()));
            context.write(encryptedLine, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: EncryptStudentData <input dir> <output dir>");
            System.exit(-1);
        }

        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Encrypt Student Data");
        job.setJarByClass(EncryptStudentData.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(EncryptMapper.class);
        job.setNumReduceTasks(0);  // Map-only job: mapper output is written directly to HDFS

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);

        // Exit code 0 on success, 1 on failure.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

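In the Apache Hadoop framework, NullWritable is a special singleton implementation of the Writable interface that carries no data. It serves as a placeholder when a key or value is not required in a MapReduce job. Here the mapper emits the encrypted line as the key and NullWritable.get() as the value, so each line of the output file contains only the encrypted text.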

Explanation
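
The encrypt method in the mapper is a Caesar cipher: every letter is shifted 3 positions forward and wraps around at the end of the alphabet, and spaces become underscores. The following is a minimal standalone sketch of that logic with a matching decrypt step, so the round trip can be checked without Hadoop. The class CaesarRoundTrip, the helper shiftLetter, the decrypt method, and the sample row are all hypothetical and not part of the assignment code. For ASCII letters the encrypt output should match the mapper's. Decryption assumes the original text contained no underscores.

```java
// Standalone sketch (no Hadoop dependency); class and method names are hypothetical.
final class CaesarRoundTrip {
    private static final int SHIFT = 3; // same shift as EncryptMapper

    // Shifts an ASCII letter by `shift` positions, wrapping within the alphabet.
    // Math.floorMod keeps the index in 0..25 even when `shift` is negative.
    static char shiftLetter(char c, int shift) {
        if (c >= 'A' && c <= 'Z') {
            return (char) (Math.floorMod(c - 'A' + shift, 26) + 'A');
        }
        if (c >= 'a' && c <= 'z') {
            return (char) (Math.floorMod(c - 'a' + shift, 26) + 'a');
        }
        return c; // digits, punctuation, and everything else stay unchanged
    }

    static String encrypt(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            sb.append(c == ' ' ? '_' : shiftLetter(c, SHIFT));
        }
        return sb.toString();
    }

    // Assumption: the plaintext had no underscores, because every '_' becomes a space again.
    static String decrypt(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            sb.append(c == '_' ? ' ' : shiftLetter(c, -SHIFT));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String line = "101,Asha Patil,MCA"; // made-up sample row, not from symca-students.csv
        String encrypted = encrypt(line);
        System.out.println(encrypted);          // 101,Dvkd_Sdwlo,PFD
        System.out.println(decrypt(encrypted)); // 101,Asha Patil,MCA
    }
}
```

A plain % would give a negative index when shifting backwards (for example 'A' - 3), which is why the sketch uses Math.floorMod for both directions.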

Create an output folder inside the Java project.
1. Command to compile the EncryptStudentData.java file
```
javac --release 8 -cp "lib/*" -d output "src/EncryptStudentData.java"
```
2. Command to create a JAR file for the EncryptStudentData program
```
jar -cvf src/EncryptStudentData.jar -C output/ .
```
3. Command to run the JAR file with Hadoop
```
hadoop jar C:\\Users\\labuser\\Desktop\\MCA54\\EncrypDataProject\\src\\EncryptStudentData.jar EncryptStudentData /mca54/symca-students.csv /mca54/output/EncryptStudentData
```
"C:....jar" is the path of the JAR file on the local system.

"EncryptStudentData" is the name of the main class to run.

"/mca54/symca-students.csv" is the path of the input file on HDFS.

"/mca54/output/EncryptStudentData" is the path of the output folder on HDFS where the job writes its output files. This folder must not already exist, or the job will fail.