The Top 3 by E3
Welcome to E3 Consulting's The Top 3 by E3! We are delighted that you are taking the time to check out our series on the profession of Independent Engineering. Our podcast aims to introduce listeners to project finance and engineering. During each episode, we will examine a topic we encounter in our daily lives as technical advisors. Topics will range from the profession of Independent Engineering to hydrogen, wind, solar, and energy storage, among many others. While we can't touch on everything about a topic during our series, we will provide listeners with the "top three" takeaways. We want to thank Joseph McDade for allowing us to use his music, Elevation, as our theme. Please check him out at https://josephmcdade.com. Again, thanks for listening, and if you have any suggestions for upcoming topics, please reach out to us at e3co@e3co.com. The E3 Crew
Series Episode Two: Climate Data in PVsyst
Daniel Tarico and Frances Wilberg-Plourde delve into PV system modeling, specifically focusing on using climate data in PVsyst software. The discussion covers how climate data, such as irradiance, temperature, and wind speed, are critical for modeling PV system performance and affect energy production. The data is typically provided in an 8760 format, representing each hour of the year.
Frances explains that PVsyst relies on typical meteorological year (TMY) files, which aggregate data from several years to represent an average year. These files are compiled from various data sources, each with its own methods and algorithms for processing weather data, including satellite and ground-based measurements.
Three primary climate data sources are discussed: the National Solar Radiation Database (NSRDB), Meteonorm, and Solar Anywhere. The NSRDB offers free data with decent accuracy, though it tends to underestimate irradiance and is not recommended for final modeling. Meteonorm, a paid service, offers synthetic data that can be useful for preliminary models but may not be precise enough for financing decisions. Solar Anywhere is considered the most accurate but comes with a cost, providing high spatial and temporal resolution data, which is especially useful for detailed production estimates and project evaluations.
Key takeaways include the importance of using location-specific data for accurate modeling, understanding the differences between data sources and their methods, and the potential benefits of paid data services for high-accuracy predictions, particularly for long-term financial planning.
Introduction to PV System Modeling
Speaker 1: Hello, welcome to the Top 3 by E3, a monthly podcast about the intersection between engineering, energy and project finance. I am Daniel Tarico, a director of renewables at E3, and I'll be your host today. Today I am here with Frances Wilberg-Plourde, a project manager here at E3. This is the second in a series of podcasts where Frances and I are discussing several topics related to photovoltaic system modeling and production estimation. This specific podcast will provide an overview of the different climate data which are used in PV system modeling. Welcome, Frances.
Speaker 2: Thank you, Dan. It's great to be here discussing this with you today. As we've said before in our series, we're going to be focusing our discussion today on PVsyst, which is the industry standard modeling software for PV systems, but the principles that are outlined in this discussion can really apply to any PV system modeling program that uses climate data sources such as the ones we're going to be discussing.
Speaker 1: That's great. Well, let's just get right to it. Two questions are outstanding here. How are these data used within the PVsyst model, and why is it important to understand the sources of climate data used in these programs?
Speaker 2: So PVsyst uses climate data profiles to model how PV systems are going to perform in a given location. These profiles include data such as irradiance, which is incident sunlight, or how much sunlight is actually hitting the modules that can be turned into electricity; ambient temperature, which is the temperature of the area right around the modules; and wind speed. All of these things drastically impact PV system production.
Speaker 2: This climate data is usually presented in what we know as 8760 format, because there are 8,760 hours in a 365-day year. So an 8760 file contains one data entry for each type of measured data for each hour of the year. So for each hour of the year you have a data input for irradiance, temperature, wind speed, etc. And PVsyst uses this data to determine several aspects of PV system performance. Irradiance data can indicate how much sunlight is available to be converted into electricity at that location. Ambient temperature and wind speed both impact PV module temperature, which impacts PV module production, since PV module temperature has an inversely correlated impact on PV system production. PV modules tend to perform better when they're at lower temperatures. So by using detailed climate data, we can have more precise modeling of PV system production.
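As a rough illustration of the points above, the sketch below counts the hourly records in a non-leap year and uses a simple steady-state heat balance to show how ambient temperature and wind speed shift module temperature, and how module temperature shifts output. The function names, the heat-loss constants `u_c` and `u_v`, the absorbed fraction and the temperature coefficient are all illustrative assumptions. They are not values quoted in the episode and should not be read as PVsyst's defaults.

```python
HOURS_PER_DAY = 24


def hours_in_year(days: int = 365) -> int:
    """Number of hourly records in a year with the given number of days."""
    return days * HOURS_PER_DAY


def module_temperature(ambient_c: float, irradiance_w_m2: float, wind_m_s: float,
                       u_c: float = 25.0, u_v: float = 1.2,
                       absorbed_fraction: float = 0.7) -> float:
    """Steady-state module temperature estimate in degrees C.

    Heat absorbed by the module is balanced against heat lost to the air.
    u_c (W/m2/K), u_v (W/m2/K per m/s) and absorbed_fraction are assumed values.
    """
    heat_loss_factor = u_c + u_v * wind_m_s
    return ambient_c + absorbed_fraction * irradiance_w_m2 / heat_loss_factor


def relative_power(irradiance_w_m2: float, module_temp_c: float,
                   temp_coeff_per_c: float = -0.004) -> float:
    """Output relative to standard test conditions (1000 W/m2, 25 C).

    The negative temperature coefficient is an assumed, typical-looking value.
    """
    return (irradiance_w_m2 / 1000.0) * (1.0 + temp_coeff_per_c * (module_temp_c - 25.0))


if __name__ == "__main__":
    print("hourly records per year:", hours_in_year())
    for wind in (0.0, 5.0):
        t_mod = module_temperature(ambient_c=30.0, irradiance_w_m2=800.0, wind_m_s=wind)
        print(f"wind {wind} m/s: module {t_mod:.1f} C, "
              f"relative power {relative_power(800.0, t_mod):.3f}")
```

With the same irradiance and ambient temperature, the windier hour gives a cooler module and a slightly higher relative output, which is the inverse relationship described above.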
Speaker 1: Okay, so to start, we have climate data that has irradiance, temperature and wind speed for every hour of the year, the 8760 file. Is that what is used to generate the typical meteorological year, or TMY, file?
Speaker 2: Yes, that's exactly where this discussion is heading. So the climate data profiles are primarily based on historical climate data from the location of the PV plant that you're looking at. These real-world data points are averaged into what is known as a typical meteorological year, or a TMY file, and this is intended to present a typical year of climate data for that location. It effectively acts as a historical average.
Speaker 2: And so those TMY files are created by processing data from a specific period of time, so say 50 or 60 years, of whatever data is available, and specific months within that data set are identified to represent the average behavior of the data set as a whole for that specific month. So, for example, if you have a data set of climate data that spans from the years 1985 through 2024, January of 2021 might appear statistically to represent the average of that month, so the average of a January data set relative to the rest of the data set. So you're pulling out months from within that data set that represent the closest to the average of that month at that location.
Speaker 2: So you might have January of 2021, April of 2015, November of 1998. And then these months are separated out and minimally processed to make sure that they flow well with the rest of the data set, to make sure you're not having jarring changes from one month to the next. And then this collection of average months is used as the historically representative data set for the year at that location.
Speaker 1: Okay, that's pretty interesting. I didn't know that a typical year was actually assembled from a collection of what would be 12 typical months or average months. So how are those months selected?
Speaker 2: So, as we said, they're based on what that month's performance looks like on average at that location, and so the determination of what constitutes an average month is weighted based on the relative importance of certain weather data categories. Irradiance is given the most weighting for TMY file creation, because you're mostly concerned about how much sunlight that location is going to get in a particular month of the year, because that's what's most directly connected to PV system production.
Speaker 1: So is there a standardized method for converting the weather data into typical months and years for the purpose of production modeling then?
Speaker 2: So the algorithms for determining how each variable is weighted and each average month is determined are quite complicated, and some data sources may have proprietary algorithms to produce these types of data, and the accuracy of the assumptions made in each climate data program's algorithms can really impact the accuracy of the resulting climate data profile. So it's important to use the climate data profiles that are known to be reliable and to have reliable algorithms that they're using. So specifically, if you're overestimating or underestimating the amount of solar irradiance that's available at a given location in a particular month, that can have a huge impact on modeled energy production from that site, which can then lead to financial losses or electrical load imbalance issues if the estimation that you're using was based on less than accurate data.
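To put a rough scale on that point, here is a back-of-the-envelope sketch. It assumes, as a first-order simplification, that energy production scales linearly with irradiance, and every input (plant output, bias, energy price, term) is a hypothetical figure chosen for the example rather than a number from the episode.

```python
def revenue_gap(true_annual_energy_mwh: float, irradiance_bias: float,
                price_per_mwh: float, years: int) -> float:
    """Cumulative difference between modeled and actual revenue.

    irradiance_bias is the fractional error in the climate file
    (0.03 means irradiance is overstated by 3 percent). A positive
    result is revenue the model promised that the plant does not earn.
    Assumes energy is proportional to irradiance and a flat price.
    """
    annual_energy_error_mwh = true_annual_energy_mwh * irradiance_bias
    return annual_energy_error_mwh * price_per_mwh * years


if __name__ == "__main__":
    # Hypothetical plant: 100,000 MWh/yr, 3 percent overstated irradiance,
    # 40 currency units per MWh, 25-year financing horizon.
    gap = revenue_gap(100_000.0, 0.03, 40.0, 25)
    print(f"cumulative revenue overstated by about {gap:,.0f}")
```

Even a bias of a few percent compounds into a large figure over a multi-decade term, and a negative bias works the same way in the other direction by understating what the project could support.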
Speaker 1: Right, so yeah, no surprise. So the quality of weather data is critical really to the actual accuracy of the model, and this naturally leads to the next question: what are some of the sources of climate data that are used in PV system models? What measurements do they take and how do they acquire that data?
Speaker 2: So there are quite a few climate data programs that are used in PV system modeling applications, each of which uses different sources and different algorithms to process their data and develop their climate data profiles. PVsyst allows users to upload climate data profiles from several different platforms, some of which are free to the user and some of which require a paid subscription. So in this short discussion today, we're going to cover three of the commonly used data programs, which are the National Solar Radiation Database, or the NSRDB; Meteonorm is the second one; and then Solar Anywhere is the third one we're going to discuss.
Speaker 2: So, starting with the NSRDB, that is a free platform that is produced by NREL, which is the National Renewable Energy Laboratory here in the US, and this data program contains meteorological data from North and South America and India and select other international locations. They're adding more locations to this data set every year. Prior to 1998, the NSRDB generated their TMY, their typical meteorological year, files from weather data collected at about 1,500 actual physical weather stations located throughout the United States, so that was based on actual ground-measured climate data. Since 1998, the NSRDB has shifted to entirely using meteorological data collected by satellites in geosynchronous orbit, so they're constantly orbiting the globe and they're collecting climate data at the locations specifically instead of having that data measured by on-site weather stations.
Speaker 1: Right. So, as of now, essentially all the data are collected using satellites that are in stationary orbits, which are really basically staring down at the same part of the Earth all the time, right?
Speaker 2: Yeah, that's mostly correct. So the 8760 solar radiation data is collected from satellites that are observing all day, every day, and the satellite data is then used to estimate the solar irradiance conditions at any location within the satellite's range using something called the physical solar model, or the PSM. The PSM uses specific knowledge of cloud properties and other algorithms to determine how much extraterrestrial solar radiation is reaching the Earth at specific locations. It includes adjustments for things like aerosols, water vapor, temperature, wind speed, relative humidity and dew point data that is collected from those satellites and then modeled using behavior models of those properties, since different locations that are receiving the same amount of extraterrestrial radiation before it goes through the atmosphere might have different levels of irradiance once it reaches the ground because of cloud cover or aerosol cover or other weather properties. So the data that is generated by the physical solar model has a time resolution of about 30 minutes and a spatial resolution of four kilometers, or about 2.5 miles.
Speaker 1: Gotcha, so that's a pretty sophisticated bit of science that goes into actually collecting the data. So what this means, really, from a practical standpoint is each pixel of weather data is about six square miles, which is a pretty large-sized solar site, and there are two measurements per hour, or 17,520 per year.
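The figures quoted above are easy to check with a little arithmetic. The sketch below converts a square pixel's side length from kilometers to miles, computes its area, and counts the samples in a 365-day year for a given sampling interval. The function names are just labels chosen for this example.

```python
KM_PER_MILE = 1.609344  # exact definition of the international mile


def pixel_side_miles(side_km: float) -> float:
    """Side length of a square pixel, converted from kilometers to miles."""
    return side_km / KM_PER_MILE


def pixel_area_sq_miles(side_km: float) -> float:
    """Area of a square pixel in square miles."""
    return pixel_side_miles(side_km) ** 2


def samples_per_year(interval_minutes: int, days: int = 365) -> int:
    """Number of samples in a year at a fixed sampling interval."""
    return days * 24 * 60 // interval_minutes


if __name__ == "__main__":
    print(f"4 km pixel side: {pixel_side_miles(4.0):.2f} miles")
    print(f"4 km pixel area: {pixel_area_sq_miles(4.0):.2f} square miles")
    print("30-minute samples per year:", samples_per_year(30))  # 17520
    print("hourly samples per year:", samples_per_year(60))     # 8760
```

Half-hourly data therefore carries twice as many records as an hourly 8760 file for the same year.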
Major Climate Data Sources Compared
Speaker 2: Yeah, and that is the case for the NSRDB. We're now going to move into discussing one of the other climate data sources, which is called Meteonorm, and this is a data set more widely used in Europe because most of its measured data is centered in Europe. So Meteonorm is a paid service, but if you have a PVsyst license, a PVsyst subscription, you can access Meteonorm data sets for free. They're included within your subscription. So, depending on the location of the site that's selected by the user within the geographical coordinates, Meteonorm may use satellite data, it may use measured ground data, or a mixture of the two to estimate solar radiation available at that location. So, depending on where you are, you might have a more or less accurate data set from Meteonorm.
Speaker 2: So, regardless of the data source, the Meteonorm data is processed to determine the daily mean of each data category, which are then summed to determine the average monthly values for the location in question. After the average monthly values are determined, hourly data is then generated using statistical models to create what Meteonorm calls a synthetic data file. The synthetic files that Meteonorm generates for a location are analogous to the TMY, the typical meteorological year file. They don't represent an actual historical year, but a hypothetical year, which statistically represents average weather patterns for a typical year at that location. Outside of Europe, Meteonorm has a spatial resolution of about eight kilometers, which is half the resolution of the NSRDB, so it's less precise than the NREL data for locations in the United States and the rest of North America.
Speaker 1: Okay, well, that's interesting. So, if I understand this correctly, Meteonorm assembles a typical year from a collection of average days of the year, whereas the NREL data looks back over a long period and assembles a typical year from a collection of what are more or less average months. Is that correct?
Speaker 2: Yeah, that's a very good summary.
Speaker 2: And the third data source we're going to discuss today is called Solar Anywhere, and Solar Anywhere is a paid service which is considered to be one of the most accurate sources of climate data for PV modeling purposes.
Speaker 2: Solar Anywhere is a private company that produces meteorological data files for almost any location on Earth, and these files can be purchased from the Solar Anywhere website to be loaded into PVsyst or can be accessed through PVsyst directly. The irradiance data that's produced by Solar Anywhere is derived from a satellite model called the Perez satellite model, which is a method of determining the ground solar irradiance based on satellite data. So when data is collected at the satellite, it's processed through this model to determine what the actual irradiance received on the ground by the solar panels would be, and this is considered to be one of the most accurate satellite data processing models that's presently available. So, because this is a paid service, Solar Anywhere does provide much finer spatial and temporal resolutions than other meteo data sources, so their spatial resolution options range from 10 kilometers to just 500 meters for some sites, which is incredibly fine, and customers can choose to export historical weather data for previous years or look at something called typical GHI, which is analogous.
Speaker 1: The GHI is the global horizontal insolation. Is that right?
Speaker 2: Yes, so GHI refers to global horizontal insolation, which is a measure of irradiance, and so a TGY file is analogous to a TMY file, but instead of looking at all meteorological data sets and kind of weighing them, the TGY looks only at irradiance and is only focused on creating a typical yearly file based on irradiance. When you're looking at PV system modeling, irradiance is obviously the meteorological data point that you're most interested in. And so the TGY is proprietary to Solar Anywhere and is used as the main data set for Solar Anywhere applications.
Speaker 1: Okay.
Speaker 2: So Solar Anywhere claims that the use of GHI exclusively and their TGY files result in more accurate solar prediction data.
Speaker 1: Okay, well, we're kind of getting into the weeds a bit, but if I was to summarize each of these available data sets, each of them uses satellite data, though the satellites have somewhat different capabilities and different resolution, they process the data using proprietary methods, and this gives us somewhat different results when you use the data with PVsyst performance modeling software. Now is that a good summary?
Speaker 2: Yeah, yeah, that's about right.
Speaker 1: Okay, so what, then, are some of the benefits and drawbacks of each data source?
Speaker 2: So the benefit of the NSRDB data sets is that they're free, they're very easily accessible and they're relatively accurate. They're actually quite accurate. So in E3's experience the NSRDB tends to produce a worst-case scenario. It tends to underestimate some components of irradiance. So we tend not to recommend use of it for PVsyst modeling for that reason. But they can be very useful in preliminary estimations and initial project evaluation. Meteonorm is also a free service if you already have a PVsyst license. But because it has lower spatial resolution outside of Europe and because it uses what they call synthetic data sets, this may make it less desirable for PV system modeling in North America or other areas outside of Europe. So in my experience what I've seen is that Meteonorm is relatively on par with the NSRDB in terms of model accuracy.
Speaker 2: So it can be useful in preliminary modeling, but it may not be precise enough for finalized production estimates to be used in financing.
Speaker 2: Solar Anywhere is considered to be one of the most accurate climate data sources presently available because of its high spatial and temporal resolution, its unique satellite model, and its TGY data sets. Solar Anywhere data profiles can cost up to $500 per site, depending on what your subscription is, but the additional cost is definitely worth it to developers and site owners who want to ensure the highest accuracy in their production estimates and project valuations.
Benefits and Drawbacks of Data Sources
Speaker 1: Yeah, that makes sense. Well, that's a good summary. With that in mind, I would ask you what are your top three takeaways from this discussion that would be valuable to a developer or someone else who's relying on PV system modeling?
Speaker 2: Number one, that PV system modeling programs need to use location-specific satellite climate data profiles to predict system production in specific locations. So we want to make sure that the models that we're developing to predict PV system production actually are tailored to their specific location and the specific meteorological conditions that exist at those locations. Number two, that different climate data programs develop their climate data profiles differently, using different data sources, different assumptions and different algorithms. So it's important to know how each program compiles their data sets to be able to make the choice for what data set you're going to be using.
Speaker 2: And number three, that paid sources of climate data may be beneficial to ensuring production forecast accuracy. So, especially if you're looking at something that's going to be financed over several decades, you want to make sure you're getting as accurate a view of the climate data at that location as possible, and so, even if that means you have to pay for a subscription or a climate data source for that model, that may be well worth it in the long run.
Top Three Takeaways
Speaker 1: Yeah, yeah, there's a lot of value in the accuracy of that estimation, to be sure. Well, thank you, Frances. You know, I've been around PVsyst for a long time and didn't necessarily understand how it all worked, and so I learned a couple of new things here, and I do appreciate that. For our audience, this is another Top 3 by E3. Thank you for listening. We are looking forward to continuing our discussions during this podcast series. If you have any questions for me or Frances or the rest of the E3 team, or if you have a suggestion for a future podcast topic, please feel free to reach out to us via email at e3co@e3co.com. We look forward to hearing from you and have a good day.