Let's Build a Thing: NPM Dependency Checker - Part 5

Picture shows Jason A. Martin - Software Engineer, Indie Game Developer, Tech Evangelist, Entrepreneur.

Note: The series start, along with dev notes: Let's Build: NPM Dependency Checker

Previous part: Let's Build: NPM Dependency Checker - Part 3

Major Refactor - Oh My Code

GitHub: NPM Dependency Checker branch: iterate-and-collect

As of this moment, NPM Dependency Checker has some serious problems. Take a look:

  • It's parsing a repo URL that may or may not be there.
  • It then transforms that URL (if present). The URL can come in several formats, so we need all sorts of String.replace calls to be even somewhat effective.
  • It's trying to follow an HTTP path to get more info (again, only if the URL is there and we transformed it correctly).
  • It's grabbing devDependencies, which aren't going to be recursively installed. It should report only the main dependencies.

The good news is that we don't need to do any of this to get the dependencies. When we call npm view, we're actually given the dependencies for that repo, so we can just pull our data from there and not do any web scraping.

New Operations

Our new code is going to operate more simply and cleanly. Here's how we're going to have it operate:

  1. Start by using npm view for a repo.
  2. Grab the dependencies from that repo and put them into a pending dependencies list.
  3. Add the repo we just checked to a completed dependencies list.
  4. Recursively call a function that, on each call, takes the first dependency off the pending list, runs npm view on it, puts that dependency on the completed list, scrubs the incoming dependencies against both lists, and adds whatever is new to the pending list.
  5. Repeat until the pending list is empty. At that point we'll report the number of dependencies the package has and write a txt file as well.
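
The steps above amount to a worklist traversal. Here's a minimal sketch of that loop so the shape is clear before we look at the real code. It assumes a hypothetical in-memory registry map in place of real npm view calls, and the names WorklistSketch, lookup and collect are made up for illustration; none of this is the code in the repo.

```elixir
defmodule WorklistSketch do
  # Hypothetical stand-in for `npm view`: maps a {name, version} tuple to its
  # dependencies. Unknown packages are treated as having no dependencies.
  def lookup(registry, package), do: Map.get(registry, package, [])

  # The pending list is empty: everything has been visited.
  def collect([], completed, _registry), do: Enum.reverse(completed)

  # Take the first pending package, look it up, and queue whatever is new.
  def collect([current | rest], completed, registry) do
    # Assumption: the current package also counts as seen, so a package that
    # lists itself as a dependency is not queued again.
    seen = MapSet.new([current | completed] ++ rest)

    new_deps =
      registry
      |> lookup(current)
      |> Enum.reject(&MapSet.member?(seen, &1))
      |> Enum.uniq()

    collect(rest ++ new_deps, [current | completed], registry)
  end
end

registry = %{
  {"a", "1.0.0"} => [{"b", "1.0.0"}, {"c", "1.0.0"}],
  {"b", "1.0.0"} => [{"c", "1.0.0"}],
  {"c", "1.0.0"} => []
}

result = WorklistSketch.collect([{"a", "1.0.0"}], [], registry)
IO.inspect(result)
IO.puts("Total: #{length(result)}")
```

Even though both "a" and "b" depend on "c", it's only visited once, because anything already pending or completed gets filtered out before it's queued.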

Config Change

We're going to add two configuration items in our Mix config so that our application knows where to store the final txt file.

Open up config/config.exs and add the following two config statements under use Mix.Config:

## Change the working_directory to a real path on your machine
config :ndc, Ndc, working_directory: "/Some/directory/here/"
config :ndc, Ndc, working_filename: "dep_list.txt"
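
For reference, here's a rough sketch of how those values can be read back at runtime. It assumes the two config lines end up as one keyword list under :ndc, Ndc, and it uses Application.put_env to simulate that so the snippet runs outside a Mix project. The module ConfigSketch and its functions are hypothetical, and the /tmp/ path is just a placeholder.

```elixir
# Simulate what the two config lines would load (placeholder path).
Application.put_env(:ndc, Ndc, [working_directory: "/tmp/", working_filename: "dep_list.txt"])

defmodule ConfigSketch do
  # Hypothetical helper: fetch one key from the keyword list stored under :ndc, Ndc.
  def setting(key) do
    :ndc
    |> Application.get_env(Ndc, [])
    |> Keyword.fetch!(key)
  end

  # Path.join/2 copes with the directory having or lacking a trailing slash.
  def output_path do
    Path.join(setting(:working_directory), setting(:working_filename))
  end
end

IO.puts(ConfigSketch.output_path())
```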

The New ndc.ex

To see all the updated code, go to this application's GitHub and select the branch for this step.

Here are some of the highlights of the changes:

decode_body: It's possible that a package has no dependencies or that a package references a bogus repo. To keep our application from crashing, I've added two function clauses that guard against these situations.

def decode_body("undefined\n", _) do
  [dependencies: %{}]
end

def decode_body("", _) do
  [dependencies: %{}]
end

npm_view: Because versions matter for dependencies, I've updated the npm_view call to include a version for the package.

def npm_view(package, version) when is_binary(package) do
  repo = "#{package}@#{version}"
  System.cmd("npm", ["view", "--json", repo])
end
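
One thing worth knowing: System.cmd returns a tuple of the command output and the exit status, which is why the pipeline later on uses elem(0) to pull out the output. Here's a small sketch of the two halves, building the arguments and unpacking the result, written so it runs without npm installed. NpmViewSketch, view_args and handle_result are hypothetical names, and checking the exit status is my own addition rather than something the app does.

```elixir
defmodule NpmViewSketch do
  # Build the argument list handed to the `npm` executable.
  def view_args(package, version) when is_binary(package) and is_binary(version) do
    ["view", "--json", "#{package}@#{version}"]
  end

  # System.cmd returns {output, exit_status}; keep the output only on success.
  def handle_result({output, 0}), do: {:ok, output}
  def handle_result({_output, status}), do: {:error, status}
end

IO.inspect(NpmViewSketch.view_args("react", "latest"))
IO.inspect(NpmViewSketch.handle_result({"{}\n", 0}))
IO.inspect(NpmViewSketch.handle_result({"", 1}))
```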

get_package_information: We want to use recursion and Elixir's pattern matching to provide a clean solution (or at least cleaner than it would have been).

I've created a function (get_package_information) that has three clauses: one that takes two arguments, one that takes a single argument, and one that takes a single, specific atom. Let's go over each one, as this is the meat of the application.

get_package_information/2

This function takes two arguments: a pending dependency list and a completed one. It's a recursive function, so it will keep calling itself.

When this function is executed it will take the first element in the pending_dependency_list and run a pipe of functions on it. The result is repo, which ends up being a list of dependencies.

We then do a bit of scrubbing to see what incoming dependencies are actually new and then we add them to the pending_dependency_list.

We also add the used package for this call to the complete_dependency_list.

At this point update_pending is a list of what's left to do. If that list has more to do, the function calls itself with /2 again. If not, it calls itself with just the complete_dependency_list.

def get_package_information(pending_dependency_list, all_dependencies_list) do
  ## Split off the dependency to process. Each entry is a {package, version} tuple.
  [current_dependency | remaining_dependencies] = pending_dependency_list
  {package, version} = current_dependency

  repo = package
    |> npm_view(version)
    |> elem(0)
    |> decode_body(@npm_view_fields)
    |> parse_dependencies

  ## Scrub deps from the npm_view call against pending + all deps. What remains is added to pending.
  dependency_union = MapSet.union(MapSet.new(all_dependencies_list), MapSet.new(remaining_dependencies))
  scrubbed_dependencies = MapSet.difference(MapSet.new(repo), dependency_union)
  update_pending = MapSet.to_list(scrubbed_dependencies) ++ remaining_dependencies
  complete_dependency_list = [current_dependency | all_dependencies_list]

  ## Process the next dep if one exists. If not, call again with just one argument.
  case update_pending do
    [] -> get_package_information(complete_dependency_list)
    _ -> get_package_information(update_pending, complete_dependency_list)
  end
end
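
To make the scrubbing step concrete, here's the same MapSet logic run on a few hand-picked tuples. The data is made up for illustration (the names and versions are borrowed from the sample output further down), and the variable names are shortened.

```elixir
completed = [{"react", "latest"}]
remaining = [{"fbjs", "^0.8.4"}, {"loose-envify", "^1.1.0"}]
incoming = [{"fbjs", "^0.8.4"}, {"object-assign", "^4.1.0"}, {"loose-envify", "^1.0.0"}]

# Everything we already know about: completed plus still pending.
known = MapSet.union(MapSet.new(completed), MapSet.new(remaining))

# Only the incoming entries we have not seen yet.
new_deps = MapSet.difference(MapSet.new(incoming), known)

IO.inspect(Enum.sort(MapSet.to_list(new_deps)))
```

Notice that loose-envify survives the scrub. The comparison is on the whole {package, version} tuple, so the same package with a different version range counts as a separate entry.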

get_package_information/:error404

It's possible that a user starts the application with a package that doesn't exist. Rather than creating a differently named function, I'm just using pattern matching.

When the atom :error404 is passed to get_package_information, the user is informed that nothing was found.

def get_package_information(:error404) do
  IO.puts "Nothing found."
end
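
Clause order matters here. Elixir tries clauses from top to bottom, so a clause that matches one specific atom has to be defined before any one-argument clause that accepts anything. Here's a tiny sketch of that dispatch; ClauseOrderSketch and report are hypothetical names, and the is_list guard is my own addition.

```elixir
defmodule ClauseOrderSketch do
  # Specific clause first: matches only the atom :error404.
  def report(:error404), do: "Nothing found."

  # General clause second: matches a list of dependencies.
  def report(deps) when is_list(deps) do
    "Total number of dependencies: #{length(deps)}"
  end
end

IO.puts(ClauseOrderSketch.report(:error404))
IO.puts(ClauseOrderSketch.report([{"react", "latest"}]))
```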

get_package_information/1

Finally, if only one argument is passed into get_package_information and it isn't the :error404 atom, it means we're done processing and need to display the results.

When this function is called, it outputs all the dependencies to the screen along with notices displaying the dependency count and location of the saved file.

def get_package_information(all_dependencies_list) do
  IO.inspect all_dependencies_list
  IO.puts "Total number of dependencies: #{length(all_dependencies_list)}"
  IO.puts "Dependencies were saved into #{@working_filename} in directory #{@working_directory}."

  final_list = Enum.reduce all_dependencies_list, [], fn {k, v}, acc ->
    updated_version = v
      |> String.replace("^", "")
      |> String.replace("~", "")
      |> String.replace("%", "")
    ["#{k}:#{updated_version}\n"]++acc
  end

  File.write!(Path.absname("#{@working_directory}#{@working_filename}"), List.to_string(final_list))
end
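
If you want to poke at the formatting step on its own, here's a small sketch that does the same version cleanup without touching the file system. DepLinesSketch and its functions are hypothetical names. It uses Enum.map instead of Enum.reduce, so unlike the function above it keeps the input order.

```elixir
defmodule DepLinesSketch do
  # Strip the same characters the function above removes: "^", "~" and "%".
  def clean_version(version) do
    String.replace(version, ["^", "~", "%"], "")
  end

  # One "name:version" line per dependency, in input order.
  def to_lines(deps) do
    Enum.map(deps, fn {name, version} -> "#{name}:#{clean_version(version)}\n" end)
  end
end

lines = DepLinesSketch.to_lines([{"asap", "~2.0.3"}, {"fbjs", "^0.8.4"}])
IO.write(lines)
```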

Sample Output

There was a lot of refactoring in this part, so let's run a test and make sure everything works before we proceed.

$ ./ndc --pkg=react

output

NPM starting package: react
[{"ua-parser-js", "^0.7.9"}, {"asap", "~2.0.3"}, {"promise", "^7.1.1"},
 {"loose-envify", "^1.0.0"}, {"whatwg-fetch", "^0.8.2"},
 {"iconv-lite", "~0.4.4"}, {"encoding", "^0.1.11"}, {"node-fetch", "^1.0.1"},
 {"isomorphic-fetch", "^2.1.1"}, {"immutable", "^3.7.6"}, {"core-js", "^1.0.0"},
 {"fbjs", "^0.8.4"}, {"js-tokens", "^1.0.1"}, {"loose-envify", "^1.1.0"},
 {"object-assign", "^4.1.0"}, {"react", "latest"}]
Total number of dependencies: 16
Dependencies were saved into dep_list.txt in directory /Users/MEMEME/Desktop/.

Sweet! We have it working, and really we're about done. There's some cleanup left to do, and maybe we'll add a feature or two. For now, let's call it a day and regroup for the final part.

I encourage you to look up the code on GitHub to make sure you have the latest.

Final Part

Expect the final part by the end of September 2016.